Abstract



To make the development of efficient multi-core applications easier,
libraries, such as Grand Central Dispatch, have been proposed. When
using such a library, the programmer writes so-called blocks, which are
chunks of code, and dispatches them, using synchronous or asynchronous
calls, to several types of waiting queues. A scheduler is then responsible
for dispatching those blocks on the available cores. Blocks can synchronize
via a global memory. In this paper, we propose Queue-Dispatch Asynchronous
Systems as a mathematical model that faithfully formalizes the
synchronization mechanisms and the behavior of the scheduler in those
systems. We study in detail their relationships to classical formalisms
such as pushdown systems, Petri nets, fifo systems, and counter systems.
Our main technical contributions are precise worst-case complexity results
for the Parikh coverability problem and the termination question for several
subclasses of our model. We give an outlook on extending our model
towards verifying input-parametrized fork-join behaviour with the help of
abstractions.

1 Introduction

The computing power delivered by computers has grown at an exponential
rate over the last decades. One of the main reasons was the steady increase of
the CPU clock rates. This growth, however, came to an end a few years
ago, because further increasing the clock rate would incur major engineering
challenges related to power dissipation. In order to overcome this and meet
the continuous need for more computing power, multi-core CPUs have been
introduced and are now ubiquitous. However, in order to harness the power
of multiple cores, software applications need to be fundamentally modified and
programmers now have to write programs with parallelism in mind. But
writing parallel programs is a notoriously difficult and error-prone task. Also,
writing efficient and portable parallel code for multi-core platforms is difficult,
as the number of available cores will vary greatly from one platform to another,
and might also depend on the current load, the energy management policy, and
so forth.

                        queue types
Parikh coverability     concurrent     serial           both
dispatch  synchr.       ExpTime-C      PSpace-C         ExpTime-C
          asynchr.      ExpSpace-C     undecidable      (undecidable)
          both          undecidable    (undecidable)    (undecidable)

Table 1: Qdas Verification Problems (parentheses: directly derivable)

In order to alleviate the task of the programmer, several high-level
programming interfaces have been proposed, and are now available on several operating
systems. A popular example is Grand Central Dispatch, GCD for short, a
technology that is present in Mac OS X (since 10.6), iOS (since version 4), and
FreeBSD. In GCD, the programmer writes so-called blocks, which are chunks of
code, and sends them to queues, together with several dependency constraints
between those blocks (for instance, one block cannot start before the previous
one in the queue has finished). The scheduler is then responsible for dispatching
those blocks on the available cores, through a thread pool that the scheduler
manages (thereby avoiding the explicit, costly and extremely error-prone
creation and destruction of threads by the programmer).

So far, to the best of our knowledge, no formal model has been proposed 
for systems relying on GCD or similar technologies, making those programs de 
facto out of reach of current verification methods and tools. This is particularly 
unfortunate as the control structure of such programs is rich and may exhibit 
complex behaviors. Indeed, the state-space of such programs is infinite even 
when types of variables are abstracted to finite domains of values. This is 
not surprising as asynchronous calls and recursive synchronous calls can send 
an unbounded number of blocks to queues. Also, those programs are, as any 
parallel program, subject to concurrency bugs that are difficult to detect using 
testing only. 

Contributions In this paper, we introduce Queue-Dispatch Asynchronous
Systems, Qdas for short, as a formal model for programs written using libraries
such as GCD. Our model is composed of blocks, which are finite transition
systems with finite data-domain variables that can make asynchronous (non-blocking)
and synchronous (blocking) calls to other blocks (possibly recursively).
However, a call does not immediately trigger the execution of the callee: the block
is inserted into a queue that can be either concurrent or serial. In concurrent
queues, several blocks can be taken from the queue and executed in parallel,
while in serial queues, a block can be dequeued only if the previous block in the
queue has completed its execution. Queues are maintained with a fifo policy.
To formalize configurations of such systems, our formal semantics relies on
call-task graphs, Ctg for short, in which nodes model tasks that are either in queues
or executing, and edges model dependencies between tasks and within queues.

We then study the decidability border for the Parikh coverability problem 
and the termination problem on several subclasses of Qdas. Our results are 
summarized in Table 1. The Parikh image of a Ctg is an abstraction that 
counts for each type and state of blocks the number of occurrences in the Ctg 
and the Parikh coverability problem asks for the reachability of a Ctg that 
contains at least a given number of blocks of each type that are in a given set of 
states. Not surprisingly, this problem is undecidable for Qdas, but we identify
several subclasses for which the problem is decidable. For those decidable cases, 
we characterize the exact complexity of the problem. 

The main positive decidability results with precise complexity are as
follows: First, we show that Qdas with only synchronous calls are essentially
equivalent to pushdown systems with finite-domain data variables, and we show
that the Parikh coverability problem is ExpTime-C for synchronous concurrent
Qdas (Theorem 1). Second, for synchronous Qdas with only serial queues, the
problem is PSpace-C (Theorem 2). Third, we show that Qdas with only
asynchronous calls and only concurrent queues are essentially equivalent to lossy
Petri nets and show that the Parikh coverability problem is ExpSpace-C for
that class (Theorem 3). This decidability border is precise, as we show that
if we allow either (i) asynchronous calls with serial queues, or (ii) synchronous
and asynchronous calls with concurrent queues, then the Parikh coverability
problem becomes undecidable (Theorem 4 and Theorem 5). The ideas of the
previous proofs allow us to derive similar results for termination with respect
to the same subclasses of Qdas. The termination problem asks, given a Qdas,
whether all its executions are finite.

We round off our results by presenting an extension of Qdas with an
explicit fork/join construct that, in addition, is parametrized by the input. As
Parikh coverability and termination lifted to this setting are undecidable, we
propose two over-approximations that allow for solutions in practice.

Remark: Due to the lack of space, detailed formal proofs are deferred to the 
appendix. 

Related Works The basic model checking result for asynchronous programs
is the ExpSpace-hardness for the control-state reachability problem obtained by
making formal a link with multi-set pushdown systems (Mpds). The two underlying
basic ideas are: (i) to untangle the call stack and the storage of pending
asynchronous calls by imposing that the next call in a serialized
execution-equivalent program is only processed when the call stack is empty; and (ii) to
only count the number of pending calls for each block while the call stack is
non-empty. The original reduction in [17] is based on Parikh's theorem and
derives the lower bound from a Petri net reachability problem [8]. A Parikh-less
reduction was presented in [13] that relied on the convergence of an over- and
under-approximation derived from interprocedural dataflow analysis.
The close relation between asynchronous programs and Petri nets can also be
used to prove additional decidability results for liveness questions [11, 10]. The
following results are based on a (polynomial-time) reduction of asynchronous
systems to an "equivalent" Petri net or extension thereof: fair termination
(i.e., testing whether each dispatched call terminates) is ExpSpace-complete,
the boundedness question (i.e., asking whether we can bound the number of
pending calls) is decidable in ExpSpace, and fair non-starvation (i.e., asking, when
assuming fairness on runs, whether every pending call is eventually dispatched)
is decidable. The authors also consider extensions of asynchronous programs
with cancellation (i.e., an additional operation removing all pending instances
of a block) and with testing whether there is no pending instance of a given block.
In the first case, they show a reduction of the model to Petri nets with transfer
arcs or reset arcs; in the second case, they show a reduction to Petri nets with one
inhibitor arc. Multi-set pushdown automata are subsumed by well-structured
transition systems with auxiliary storage and inherit their decidability results
presented in [6, 7]. Analogously, one can show that termination, control-state
maintainability, and simulation with respect to finite state systems are decidable
for asynchronous programs.

None of the models considered in the aforementioned publications takes into
account causality constraints on the sequence of asynchronous dispatch calls, as would
be necessary to model the fifo policies of GCD. However, this is possible with
Qdas. A more detailed look at the differences between the model of [10] and
the (fifo-less) subclass of asynchronous serial Qdas is presented in Section 4.

A series of parallel programming libraries and techniques is formalized in [3]
with the help of recursively parallel programs. These allow modeling fork/join
based parallel computations based on a reduction to recursive vector addition
systems with states. With respect to Qdas and asynchronous programming, 
recursively parallel programs only cover the classical asynchronous models pre- 
sented above and not the advanced scheduling strategies for different queues 
that introduce more sophisticated behaviours. 

2 Preliminaries 

Grand Central Dispatch (GCD) is a technology developed by Apple [1, 2]
that is publicly available at http://libdispatch.macosforge.org/ under a
free license. GCD is the main inspiration for the formal model of queue-dispatch
asynchronous systems. In the following, we often present our examples as pseudo
code using a syntax inspired by GCD. In the GCD framework, the programmer
has to organize their code into blocks. During the execution of a GCD program,
one or several tasks run in parallel, each executing a given block (initially, only
the main block is running). Tasks can call (or dispatch in the GCD vocabulary)
other blocks, either synchronously (the call is blocking), or asynchronously (the
call is not blocking). A dispatch consists in inserting the block into a fifo queue.
In our examples, we use the keywords dispatch_a and dispatch_s to refer to
asynchronous and synchronous dispatches, respectively. At any time, the scheduler
can decide to dequeue blocks from the queues and to assign them to tasks
for execution. All queues ensure that the blocks are dequeued in fifo order;
however, the actual scheduling policy depends on the type of queue. GCD
supports two types of queues: concurrent queues allow several tasks from the same
queue to run in parallel, whereas serial queues guarantee that at most one task
from this queue is running. In our examples, concurrent (or serial) queues are
declared as global variables of type c_queue (s_queue). In addition, all blocks
have access to the same set of global variables (in this work, we assume that the
variables range over finite domains).

 1  global int const l, m, n
 2  global int[l][m] matrix1, int[m][n] matrix2, int[l][n] matrix
 3  global c_queue workqueue, s_queue semaphore, int count
 4  block increase():
 5      count = count + 1
 6  block one_cell(int i, int j):
 7      for k in range(m):
 8          matrix[i][j] += matrix1[i][k] * matrix2[k][j]
 9      dispatch_s(semaphore, increase())
10  def main():
11      // read input matrix1, matrix2
12      count = 0
13      for i in range(l):
14          for j in range(n):
15              dispatch_a(workqueue, one_cell(i, j))
16      wait(count = l*n)
17      // print the result

Figure 1: GCD(-like) program for parallel matrix multiplication

Example 1 Let us consider the pseudo code in Fig. 1 that computes the product
of two integer matrices matrix1 and matrix2 of constant size (l, m, n) in a
matrix matrix. The main task forks a series of one_cell blocks. Each one_cell
computes the value of a single cell of the result. The parallelism is achieved via
the GCD scheduler, thanks to asynchronous dispatches on the concurrent queue
workqueue. Asynchronous dispatches are needed to make sure that main is not
blocked after each dispatch, and a concurrent queue allows all the one_cell blocks
to run in parallel. The variable count is incremented each time the computation
of a cell is finished and acts as a semaphore for the main block, to ensure that
matrix contains the final result. As only reading and writing to a variable are
atomic, we need to guarantee exclusive access for the two consecutive operations on
count (line 5). This is achieved by a dedicated block increase that is dispatched
to the serial queue semaphore. As only increase blocks can increase count,
this queue implicitly locks the access to the variable. Moreover, the synchronous
dispatch in line 9 guarantees that a block terminates only after it has increased
count.
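
The behaviour described in Example 1 can be approximated in ordinary Python. The following is only a sketch under stated assumptions, not GCD itself: a multi-threaded `ThreadPoolExecutor` stands in for the concurrent queue `workqueue`, a single-threaded one stands in for the serial queue `semaphore`, and the `wait(count = l*n)` of Fig. 1 is replaced by joining the returned futures. The function name `multiply` and the `workers` parameter are illustrative choices.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply(matrix1, matrix2, workers=4):
    """Parallel matrix product in the style of Fig. 1 (illustrative sketch)."""
    l, m, n = len(matrix1), len(matrix2), len(matrix2[0])
    matrix = [[0] * n for _ in range(l)]
    count = [0]  # shared counter, only modified by tasks of the serial queue

    # Assumption: a pool with several threads plays the concurrent queue,
    # a pool with exactly one thread plays the serial queue.
    with ThreadPoolExecutor(max_workers=workers) as workqueue, \
            ThreadPoolExecutor(max_workers=1) as semaphore:

        def increase():
            # Runs on the single serial thread, so the read and the write
            # of count cannot interleave with another increase.
            count[0] = count[0] + 1

        def one_cell(i, j):
            for k in range(m):
                matrix[i][j] += matrix1[i][k] * matrix2[k][j]
            # Synchronous dispatch: block until increase has finished.
            semaphore.submit(increase).result()

        # Asynchronous dispatches: main is not blocked by submit().
        futures = [workqueue.submit(one_cell, i, j)
                   for i in range(l) for j in range(n)]
        for future in futures:
            future.result()  # stands in for wait(count = l*n)

    return matrix, count[0]
```

Calling `multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])` yields the product matrix together with the final value of the counter, which equals the number of cells because every `one_cell` task waits for its own `increase`.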

Basic Notations: Given a set $S$, let $|S|$ denote its cardinality. For an
$I$-indexed family of sets $(S_i)_{i \in I}$, we write elements of $\prod_{i \in I} S_i$ in bold face, i.e.,
$\mathbf{s} \in \prod_{i \in I} S_i$. The $i$-component of $\mathbf{s}$ is written $s_i \in S_i$, and we identify $\mathbf{s}$ with
the indexed family of elements $(s_i)_{i \in I}$. We use $\uplus$ to denote the disjoint union of
sets. An alphabet $\Sigma$ is a finite set of letters. We write $\Sigma^*$ for the set of all finite
words over $\Sigma$ and denote the empty word by $\varepsilon$. The concatenation of two words
$w, w'$ is represented by $w \cdot w'$. For a letter $\sigma \in \Sigma$ and a word $w \in \Sigma^*$, let $|w|_\sigma$
be the number of occurrences of $\sigma$ in $w$. We use standard complexity classes,
e.g., polynomial time (PTime) or deterministic exponential time (ExpTime),
and mark completeness by appending "-C" (e.g., PSpace-C).

Let $D$ be a finite data domain with an initial element $d_0 \in D$, and let $X$
be a finite set of variables ranging over $D$. A valuation of the variables in $X$ is a
function $d : X \to D$. An atom is an expression of the form $x = d$ or $x \neq d$, where
$x \in X$ and $d \in D$. A guard is a finite conjunction of atoms. An assignment
is an expression of the form $x \leftarrow v$, where $x \in X$ and $v \in D$. Let $\mathit{guards}(X)$,
$\mathit{assign}(X)$ and $\mathit{vals}(X)$ denote respectively the sets of all guards, assignments
and valuations over variables from $X$. Guards, atoms and valuations have their
usual semantics: for all valuations $d$ of $X$ and all $g \in \mathit{guards}(X)$, we write $d \models g$
iff $d$ satisfies $g$.
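
As a small illustration of these definitions, the sketch below represents a valuation as a Python dictionary, an atom as a triple `(x, op, d)` with `op` either `"=="` or `"!="`, and a guard as a list of atoms. This encoding and the function names `satisfies` and `apply_assignment` are assumptions made for the example, not notation from the model.

```python
def satisfies(valuation, guard):
    """Return True iff the valuation satisfies every atom of the guard."""
    for x, op, d in guard:
        if op == "==" and valuation[x] != d:
            return False
        if op == "!=" and valuation[x] == d:
            return False
    return True


def apply_assignment(valuation, x, v):
    """Return the valuation obtained after the assignment x <- v."""
    updated = dict(valuation)  # valuations are treated as immutable values
    updated[x] = v
    return updated
```

The empty guard (an empty conjunction) is satisfied by every valuation, which matches the usual semantics of conjunction.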

A pushdown system with data is a pushdown system (see [4] for details)
equipped with a finite set of variables $X$ over a finite domain $D$. A configuration
of a Pds with data is a triple $(s, w, d)$ where $s$ is a control state, $w$ is the stack
content, and $d$ is a valuation of the variables.

Proposition 1 The reachability problem is ExpTime-C for Pds with data. 

A Petri net (Pn) is a tuple $N = \langle P, T, m_0 \rangle$ where $P$ is a finite set of places, a
marking of the places is a function $m : P \to \mathbb{N}$ that associates, to each place $p \in P$,
a number $m(p)$ of tokens, $T$ is a finite set of transitions, each transition $t \in T$
is a pair $(I_t, O_t)$ where $I_t : P \to \{0, 1\}$ and $O_t : P \to \{0, 1\}$ are respectively
the input and output functions of $t$, and $m_0$ is the initial marking. Given two
markings $m_1$ and $m_2$, we let $m_1 \preceq m_2$ iff $m_1(p) \leq m_2(p)$ for all $p \in P$. Given
a marking $m$, a transition $t = (I_t, O_t)$ is enabled in $m$ iff $m(p) \geq I_t(p)$ for
all $p \in P$. When $t$ is enabled in $m$, one can fire the transition $t$ in $m$, which
produces a new marking $m'$ s.t. $m'(p) = m(p) - I_t(p) + O_t(p)$ for all $p$. This is
denoted $m \xrightarrow{t} m'$, or simply $m \to m'$ when the transition identity is irrelevant.
A run is a finite sequence $m_0 m_1 \ldots m_n$ s.t. for all $1 \leq i \leq n$: $m_{i-1} \to m_i$. For a
Pn $N$, we denote by $\mathit{Reach}(N)$ (resp. $\mathit{Cover}(N)$) the reachability (coverability)
set of $N$, i.e., the set of all markings $m$ s.t. there exists a run $m_0 m_1 \ldots m_n$
of $N$ with $m = m_n$ ($m \preceq m_n$). The coverability problem asks, given a Pn $N$
and a marking $m$, whether $m \in \mathit{Cover}(N)$. It is ExpSpace-complete [8]. The
termination problem, i.e., whether all executions of the Petri net are finite, is
ExpSpace-C [15, 16].
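
The firing rule and the covering order can be made concrete with a few lines of Python. This is a minimal sketch under assumptions: markings and the input and output functions are dictionaries from place names to natural numbers (absent places count as 0), and `coverable_within` is a hypothetical helper that only explores runs up to a given length. It is a bounded search for experimentation, not the ExpSpace decision procedure referred to above.

```python
def enabled(marking, transition):
    """A transition (inputs, outputs) is enabled iff m(p) >= I_t(p) for all p."""
    inputs, _ = transition
    return all(marking.get(p, 0) >= k for p, k in inputs.items())


def fire(marking, transition):
    """Return m' with m'(p) = m(p) - I_t(p) + O_t(p)."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    inputs, outputs = transition
    new = dict(marking)
    for p, k in inputs.items():
        new[p] = new.get(p, 0) - k
    for p, k in outputs.items():
        new[p] = new.get(p, 0) + k
    return new


def covers(big, small):
    """Covering order: small(p) <= big(p) for every place p."""
    return all(big.get(p, 0) >= k for p, k in small.items())


def coverable_within(m0, transitions, target, max_steps):
    """Bounded breadth-first search for a run of length <= max_steps
    that ends in a marking covering target (illustrative only)."""
    def key(m):
        return frozenset((p, k) for p, k in m.items() if k)

    frontier, seen = [m0], {key(m0)}
    for _ in range(max_steps + 1):
        if any(covers(m, target) for m in frontier):
            return True
        next_frontier = []
        for m in frontier:
            for t in transitions:
                if enabled(m, t):
                    m2 = fire(m, t)
                    if key(m2) not in seen:
                        seen.add(key(m2))
                        next_frontier.append(m2)
        frontier = next_frontier
    return False
```

A negative answer of `coverable_within` only says that no covering marking was found within the bound; it does not show that the target is not coverable.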

3 Queue-dispatch asynchronous systems 

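Syntax: We now define our formal model for queue-dispatch asynchronous
systems. Let $D$ be a finite data domain containing an initial value $d_0$. A
queue-dispatch asynchronous system (Qdas) $\mathcal{A}$ is a tuple
$\langle \mathit{CQID}, \mathit{SQID}, \Gamma, \mathit{main}, X, \Sigma, (TS_\gamma)_{\gamma \in \Gamma} \rangle$
where:

• $\mathit{CQID}$ and $\mathit{SQID}$ are respectively sets of (c)oncurrent and (s)erial queues;

• $\Gamma$ is the finite set of blocks and $\mathit{main} \in \Gamma$ the initial block. Each block $\gamma \in \Gamma$
is a tuple $\langle S_\gamma, s^0_\gamma, f_\gamma, \Sigma, \Delta_\gamma \rangle$ where $\langle S_\gamma, s^0_\gamma, \Sigma, \Delta_\gamma \rangle$ is an Lts and $f_\gamma \in S_\gamma$ a
distinct final state;

• $X$ is a finite set of $D$-valued variables;

• $\Sigma$ is the set of actions, with $\Sigma = (\{\mathit{dispatch}_s, \mathit{dispatch}_a\} \times (\mathit{CQID} \cup \mathit{SQID}) \times (\Gamma \setminus \{\mathit{main}\})) \cup \mathit{guards}(X) \cup \mathit{assign}(X)$.

We assume that $\mathit{SQID}$, $\mathit{CQID}$, $\Gamma$, $X$, and all $S_\gamma$ for $\gamma \in \Gamma$ are disjoint from
each other. Let $S = \bigcup_{\gamma \in \Gamma} S_\gamma$, $F = \bigcup_{\gamma \in \Gamma} \{f_\gamma\}$, $\Delta = \bigcup_{\gamma \in \Gamma} \Delta_\gamma$, and $\mathit{QID} =
\mathit{SQID} \cup \mathit{CQID} \cup \{\bot\}$ (where $\bot \notin \mathit{SQID} \cup \mathit{CQID}$). We further assume that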

Call-task graphs: We formalize the semantics of Qdas using the notion of 
call-task graph (Ctg) to describe the system's global configurations. 

A configuration of a Qdas (see Fig. 2 for an example) contains a set of
running tasks, represented by task vertices (depicted by round nodes), and a set of
called but unscheduled blocks, represented by call vertices (square nodes). Call
vertices are held by queues, and the linear order of each queue is represented
by queue edges (solid edges). Synchronous calls add an additional dependency
(the caller is waiting for the termination of the callee) that is represented by a
wait edge (dashed edges) between the caller and the callee. Wait edges are also
inserted between the head of a serial queue and the running task that has been
extracted from this queue (if it exists) to indicate that the task has to terminate
before a new block can be dequeued. Note that only vertices without outgoing
edges can execute a computation step; the others are currently blocked. Each
node $v$ is labeled by a block $\lambda(v)$, and by the identifier $\mathit{queue}(v)$ of the queue
that contains it (for call vertices) or that contained it (for task vertices). Task
vertices are labeled by their current state $\mathit{state}(v)$ (for convenience, we also
label call vertices by the initial state of their respective blocks; this is not shown in
the figure).

Example 2 The Ctg in Fig. 2 depicts a configuration of a Qdas with two
queues. Queue $q_2$ is serial (note the outgoing wait edge to the running task)
and contains $\gamma_2\gamma_2\gamma_2$, and $q_1$ is concurrent with content $\gamma_1\gamma_2$. There are 4 active
tasks, two of them (main and the task running $\gamma_1$) are blocked. The task running
$\gamma_3$ has been dequeued from $q_2$ and is currently at location $s$.

Formally, given a Qdas $\mathcal{A} = \langle \mathit{CQID}, \mathit{SQID}, \Gamma, \mathit{main}, X, \Sigma, (TS_\gamma)_{\gamma \in \Gamma} \rangle$, a
call-task graph over $\mathcal{A}$ is a tuple $G_\mathcal{A} = \langle V, E, \lambda, \mathit{queue}, \mathit{state} \rangle$ where: $V =
V_C \uplus V_T$ is a finite set of vertices, partitioned into a set $V_C$ of call vertices and
a set $V_T$ of task vertices; $E \subseteq V \times V$ is a set of edges; $\lambda : V \to \Gamma$ labels each
vertex by a block; $\mathit{queue} : V \to \mathit{QID} \cup \{\bot\}$ associates each vertex to a queue
identifier (or $\bot$); and $\mathit{state} : V \to S$ associates each vertex to an Lts state. For
each $q \in \mathit{QID}$, let $V_q = \{v \in V \mid \mathit{queue}(v) = q\}$. The set $E$ is partitioned into
the set $E_W$ of wait edges and the set $E_Q = \biguplus_{q \in \mathit{QID}} E_q$ of queue edges where,
for each $q \in \mathit{QID}$, $E_q \subseteq E \cap (V_q \times V_q)$.

Figure 2: Ctg for a Qdas with a concurrent queue $q_1$ and a serial queue $q_2$

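A Ctg is empty iff $V = \emptyset$. The Parikh image $\mathit{Parikh}(G)$ of a Ctg $G$ of $\mathcal{A}$ is
a function $f : S \to \mathbb{N}$, s.t. for all $s \in S$: $f(s) = |\{v \in V \mid \mathit{state}(v) = s\}|$. Given
two Parikh images $\mathit{Parikh}(G)$ and $\mathit{Parikh}(G')$, we let $\mathit{Parikh}(G) \preceq \mathit{Parikh}(G')$ iff
for all $s \in S$: $\mathit{Parikh}(G)(s) \leq \mathit{Parikh}(G')(s)$. A path (of length $n$) in $G_\mathcal{A}$ is a
sequence of vertices $v_0, v_1, \ldots, v_n$ s.t. for all $1 \leq i \leq n$: $(v_{i-1}, v_i) \in E$. Such a
path is simple iff $v_i \neq v_j$ for all $1 \leq i < j \leq n$. The restriction of $G_\mathcal{A}$ to $V' \subseteq V$
is the Ctg $G'_\mathcal{A} = \langle V', E', \lambda', \mathit{queue}', \mathit{state}' \rangle$, where $E' = E \cap (V' \times V')$, and $\lambda'$,
$\mathit{queue}'$ and $\mathit{state}'$ are respectively the restrictions of $\lambda$, $\mathit{queue}$ and $\mathit{state}$ to $V'$.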

In the rest of the paper, we assume that all the Ctg we consider are
well-formed, i.e., they fulfill the following requirements:

1. For each $v \in V_T$: $\mathit{state}(v) \in S_{\lambda(v)}$, where $S_{\lambda(v)}$ are the states of $TS_{\lambda(v)}$.

2. Each call vertex has at most one outgoing (queue or wait) edge, at most
one incoming wait edge, and at most one incoming queue edge. Each task
vertex has at most one outgoing, and at most one incoming wait edge.

3. For each $q \in \mathit{QID}$, the restriction of $G_\mathcal{A}$ to $V_q$ is either empty or contains
one and only one simple path of length $|V_q| - 1$. Intuitively, this ensures
the well-formedness of the queues.

4. For each $q \in \mathit{SQID}$, there is at most one task vertex $v$ s.t. $\mathit{queue}(v) = q$.
This ensures that queues in $\mathit{SQID}$ indeed force the serial execution of their
members.






For convenience, we also introduce the following notations. Let $G_\mathcal{A}$ be a
Ctg, and let $q$ be a queue identifier of $\mathcal{A}$. Then, $\mathit{head}(q, G_\mathcal{A})$ and $\mathit{tail}(q, G_\mathcal{A})$
denote respectively the head and the tail of $q$ in the configuration described by
$G_\mathcal{A}$, that is, $\mathit{tail}(q, G_\mathcal{A})$ is the call vertex $v \in V_q$ that has no incoming queue
edge, or $\bot$, if such a vertex does not exist; and $\mathit{head}(q, G_\mathcal{A})$ is the call vertex
$v \in V_q$ that has no outgoing queue edge (but possibly an outgoing wait edge), or
$\bot$, if such a vertex does not exist. Remark that, when they exist, these vertices
are necessarily unique because of the well-formedness assumptions. Finally, we
say that a vertex $v$ is unblocked iff it has no outgoing edge, and that it is final iff
(i) $v$ is an unblocked task vertex and (ii) $\mathit{state}(v) = f_{\lambda(v)}$ (that is, $v$ represents
a task that has reached the final state of its transition system and is not waiting
on another task).
Let us now define several operations on Ctg. We will rely on these
operations when defining the formal semantics of Qdas. Let $\mathcal{A}$ be a Qdas and
$G_\mathcal{A} = \langle V, E, \lambda, \mathit{queue}, \mathit{state} \rangle$ be a Ctg for $\mathcal{A}$. Then:

• for all $v \in V$: $G \setminus v$ is the restriction of $G$ to $V \setminus \{v\}$.

• for all $\gamma \in \Gamma$ and $q \in \mathit{QID}$, $\mathit{enqueue}(q, \gamma)(G_\mathcal{A})$ is the Ctg $\langle V', E', \lambda', \mathit{queue}', \mathit{state}' \rangle$
where: $V' = V \cup \{v'\}$, $v'$ is a fresh call vertex, $\lambda'(v') = \gamma$, $\mathit{queue}'(v') = q$,
$\mathit{state}'(v') = s^0_\gamma$, and for all $v \in V$: $\lambda'(v) = \lambda(v)$, $\mathit{queue}'(v) = \mathit{queue}(v)$ and
$\mathit{state}'(v) = \mathit{state}(v)$. Finally, $E' = E \cup E_1 \cup E_2$, where: (i) $E_1 = \{(v', \mathit{tail}(q, G_\mathcal{A}))\}$ if $\mathit{tail}(q, G_\mathcal{A}) \neq \bot$,
and $E_1 = \emptyset$ otherwise, and (ii) if $v \in V$ is a task node s.t. $\mathit{queue}(v) = q \in
\mathit{SQID}$, then $E_2 = \{(v', v)\}$, otherwise $E_2 = \emptyset$. Intuitively, this operation
inserts a call to $\gamma$ in the queue $q$, by creating a new vertex $v'$ and adding
an edge to maintain the FIFO ordering, if necessary (set $E_1$). In the case
of a serial queue that was empty before the enqueue, a supplementary edge
(in set $E_2$) might be necessary to ensure that $v'$ is blocked by a currently
running $v$ which has been extracted from $q$.

• for all $q \in \mathit{QID}$, if $\mathit{head}(q)$ is different from $\bot$ and unblocked, then $\mathit{dequeue}(q)(G_\mathcal{A})$
is the Ctg $\langle V'_C \uplus V'_T, E', \lambda, \mathit{queue}, \mathit{state} \rangle$ where $V'_C = V_C \setminus \{\mathit{head}(q)\}$ and
$V'_T = V_T \cup \{\mathit{head}(q)\}$. Otherwise, $\mathit{head}(q) = \bot$ and $\mathit{dequeue}(q)(G_\mathcal{A})$ is
undefined. Intuitively, this operation removes the first (with respect to the FIFO
ordering) block from $q$ and turns the corresponding call vertex $\mathit{head}(q)$ into
a task vertex, meaning that the block is now running as a task.

• for all $\delta = (s, a, s') \in \Delta$, $\mathit{step}(\delta)(G_\mathcal{A})$ is a set of Ctg defined as follows.
$\langle V, E, \lambda, \mathit{queue}, \mathit{state}' \rangle \in \mathit{step}(\delta)(G_\mathcal{A})$ iff there exists an unblocked $v \in V_T$
s.t. $\mathit{state}(v) = s$, $\mathit{state}'(v) = s'$ and for all $v' \neq v$: $\mathit{state}'(v') = \mathit{state}(v')$.
Remark that $\mathit{step}(\delta)(G_\mathcal{A})$ can be empty. Intuitively, each graph in $\mathit{step}(\delta)(G_\mathcal{A})$
corresponds to the firing of an $a$-labeled transition by a task that is not
blocked.

• for all unblocked $v \in V \cup \{\bot\}$, all $v' \in V$: $\mathit{letwait}(v, v')(G_\mathcal{A})$ is either the
Ctg $G_\mathcal{A}$ if $v = \bot$, or the Ctg $\langle V, E \cup \{(v, v')\}, \lambda, \mathit{queue}, \mathit{state} \rangle$ if $v \neq \bot$.
Intuitively, this operation adds a wait edge between nodes $v$ and $v'$ when
$v \neq \bot$, and does not modify the Ctg otherwise.

Semantics of Qdas: For a Qdas $\mathcal{A}$ with set of variables $X$, a configuration
is a pair $(G, d)$, where $G$ is a Ctg of $\mathcal{A}$ and $d \in \mathit{vals}(X)$. The
operational semantics of $\mathcal{A}$ is given as a transition system $[\![\mathcal{A}]\!]$ whose states are
configurations of $\mathcal{A}$, and whose transitions reflect the semantics of the
actions labeling the transitions of the Qdas. Formally, given a Qdas $\mathcal{A} =
\langle \mathit{CQID}, \mathit{SQID}, \Gamma, \mathit{main}, X, \Sigma, (TS_\gamma)_{\gamma \in \Gamma} \rangle$, $[\![\mathcal{A}]\!]$ is the labeled transition system
$\langle C, c^0, \Sigma_\varepsilon, \Rightarrow \rangle$ where: (i) $C$ contains all the pairs $(G, d)$ where $d \in \mathit{vals}(X)$, and
$G$ is a Ctg of $\mathcal{A}$, (ii) $c^0 = (G^0, d^0)$ with $d^0(x) = d_0$ for all $x \in X$, and $G^0 =
\langle \{v^0\}, \emptyset, \lambda, \mathit{queue}, \mathit{state} \rangle$, where $v^0$ is a task node, $\lambda(v^0) = \mathit{main}$, $\mathit{state}(v^0) =
s^0_{\mathit{main}}$ and $\mathit{queue}(v^0) = \bot$, (iii) $\Sigma_\varepsilon = \Sigma \cup \{\varepsilon\}$ and (iv) $((G, d), a, (G', d')) \in \Rightarrow$ iff
one of the following holds:

Async. dispatch: $a = \mathit{dispatch}_a(q, \gamma)$, $d' = d$, and there are $\delta = (s, a, s') \in \Delta$
and $G'' \in \mathit{step}(\delta)(G)$ s.t.: $G' = \mathit{enqueue}(q, \gamma)(G'')$.

Sync. dispatch: $a = \mathit{dispatch}_s(q, \gamma)$, $d' = d$ and there are $\delta = (s, a, s') \in \Delta$
and $G'' \in \mathit{step}(\delta)(G)$ s.t.: $G' = \mathit{letwait}(v, v')(\mathit{enqueue}(q, \gamma)(G''))$ where $v$
is the node whose state has changed during the step operation, and $v'$ is
the fresh node that has been created by the enqueue operation. That is,
a call vertex $v'$ labeled by $\gamma$ is added to $q$ and a wait edge is added
between the node $v$ representing the task that performs the synchronous
dispatch, and $v'$, as the dispatch is synchronous.

Test: $a = g \in \mathit{guards}(X)$, $d' = d$, $d \models g$, and there is $\delta = (s, a, s') \in \Delta$ s.t.
$G' \in \mathit{step}(\delta)(G)$.

Assignment: $a = x \leftarrow v \in \mathit{assign}(X)$, $d'(x) = v$, for all $x' \neq x$: $d'(x') = d(x')$
and there is $\delta = (s, a, s') \in \Delta$ s.t. $G' \in \mathit{step}(\delta)(G)$.

Scheduler action: $a = \varepsilon$, $d' = d$ and:

• either there is a final vertex $v$ s.t. $G' = G \setminus v$;

• or there is $q \in \mathit{CQID}$ s.t. $\mathit{head}(q, G) \neq \bot$ and $G' = \mathit{dequeue}(q)(G)$.
That is, the scheduler schedules a block (represented by $\mathit{head}(q, G)$) from a
concurrent queue.

• or there is $q \in \mathit{SQID}$ s.t. $\mathit{head}(q, G) = v$, $v$ is unblocked, as well
as $G' = \mathit{letwait}(\mathit{head}(q, G''), v)(G'')$ and $G'' = \mathit{dequeue}(q)(G)$. That
is, the scheduler schedules a block (represented by $v$) from the serial
queue $q$. As the queue is serial, a wait edge is inserted between the
next waiting block in $q$ (now represented by $\mathit{head}(q, G'')$) and $v$.

A run $\rho$ of a Qdas is an alternating sequence $c_0 a_1 c_1 a_2 \cdots a_n c_n$ of
configurations and actions where $(c_i, a_{i+1}, c_{i+1}) \in \Rightarrow$ for all $0 \leq i < n$ and $c_0 = c^0$. A
run is finite if this sequence is finite. A configuration $c$ is reachable in $\mathcal{A}$ iff there
exists a finite run $c_0 a_1 c_1 a_2 \ldots a_n c_n$ of $\mathcal{A}$ s.t. $c_n = c$. We denote by $\mathit{Reach}(\mathcal{A})$
the set of all reachable configurations of $\mathcal{A}$.

The decision problem on Qdas we mainly consider in this work is the Parikh
coverability problem: given a Qdas $\mathcal{A}$ with set of locations $S$ and a function
$f : S \to \mathbb{N}$, it asks whether there is $c = (G, d) \in \mathit{Reach}(\mathcal{A})$ s.t. $f \preceq \mathit{Parikh}(G)$.
When the answer to this question is 'yes', we say that $f$ is Parikh-coverable in
$\mathcal{A}$. It is well-known that meaningful verification questions can be reduced to this
problem. For instance, consider a mutual exclusion question, asking whether it
is possible to reach, in a Qdas $\mathcal{A}$, a configuration in which at least two tasks
are executing the same block $\gamma$ and are in the same control state $s$. If yes, the
mutual exclusion (of control state $s$) is violated. This can be encoded into an
instance of the Parikh coverability problem, where $f(s) = 2$ and $f(s') = 0$ for
all $s' \neq s$, and would allow us, for example, to verify whether more than one
block of type increase can be running in Example 1.
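
The Parikh image and the covering test for a single, given Ctg are easy to compute. The sketch below assumes that the labeling `state` of a Ctg is available as a dictionary from vertices to states; the function names and the state names used in the usage note are illustrative. It only checks one configuration and does not decide the coverability problem, which quantifies over all reachable configurations.

```python
from collections import Counter


def parikh_image(state):
    """Count, for each state s, the vertices v with state(v) = s."""
    return Counter(state.values())


def covers(f, image):
    """Check that f(s) <= image(s) for every state s constrained by f.

    States absent from f are treated as f(s) = 0, so they never fail.
    """
    return all(image[s] >= k for s, k in f.items())
```

For the mutual exclusion question, one would take `f = {s: 2}` for the state `s` of interest: a Ctg with labeling `{"v0": "main_wait", "v1": "inc_run", "v2": "inc_run"}` covers `{"inc_run": 2}`, which would witness a violation if that configuration were reachable.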
Figure 3: The two possible forms of reachable Ctgs in a synchronous Qdas

In addition, we look at the (universal) termination problem: given a Qdas
$\mathcal{A}$, it asks whether all executions of $\mathcal{A}$ are finite, i.e., there is no infinite run of
$\mathcal{A}$. Regarding Example 1, this allows us to test whether the main task terminates,
i.e., all dispatched blocks terminate.

4 From the Parikh coverability problem to Termination

Before turning to the termination problem, we first study in this section the
Parikh coverability problem from a computational point of view. As expected,
this problem is undecidable in general. However, when restricting the types of
queues and dispatches that are allowed, it is possible to retain decidability. In
these cases, we characterize the complexity of the problem. Formally, we
consider the following subclasses of Qdas. A Qdas $\mathcal{A}$ with set of transitions $\Delta$,
set of serial queues $\mathit{SQID}$ and set of concurrent queues $\mathit{CQID}$, is synchronous
iff there exists no $(s, a, s') \in \Delta$ with $a \in \{\mathit{dispatch}_a\} \times \mathit{QID} \times \Gamma$; it is
asynchronous iff there exists no $(s, a, s') \in \Delta$ with $a \in \{\mathit{dispatch}_s\} \times \mathit{QID} \times \Gamma$;
it is concurrent iff $\mathit{SQID} = \emptyset$ and $\mathit{CQID} \neq \emptyset$; it is serial iff $\mathit{CQID} = \emptyset$ and
$\mathit{SQID} \neq \emptyset$; it is queueless iff $\mathit{CQID} = \mathit{SQID} = \emptyset$.
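
These subclass definitions are purely syntactic and can be checked directly. The sketch below assumes a hypothetical encoding in which a transition is a triple `(s, action, s2)` and every action is a tuple whose first component is one of `"dispatch_a"`, `"dispatch_s"`, `"guard"` or `"assign"`; the function name `classify` is likewise an illustrative choice.

```python
def classify(cqid, sqid, delta):
    """Return the set of subclass names that a Qdas falls into.

    cqid, sqid: sets of concurrent and serial queue identifiers.
    delta: iterable of transitions (s, action, s2).
    """
    kinds = {action[0] for (_, action, _) in delta}
    labels = set()
    if "dispatch_a" not in kinds:
        labels.add("synchronous")   # no asynchronous dispatch occurs
    if "dispatch_s" not in kinds:
        labels.add("asynchronous")  # no synchronous dispatch occurs
    if not sqid and cqid:
        labels.add("concurrent")
    if not cqid and sqid:
        labels.add("serial")
    if not cqid and not sqid:
        labels.add("queueless")
    return labels
```

Note that the classes overlap: a system without any dispatch transition is both synchronous and asynchronous according to the definitions above, and the program of Fig. 1, which uses both kinds of dispatch and both kinds of queue, falls into none of them.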

Queueless Qdas: In a queueless Qdas, there is no dispatch possible, so the
only task that can execute at any time is the main one. Thus, configurations of
queueless Qdas can be encoded as tuples $(s, d)$, where $s$ is a state of main, and
$d$ is a valuation of the variables. Hence queueless Qdas are essentially Lts with
variables over a finite data domain, thus:

Proposition 2 The Parikh coverability problem is PSpace-C for queueless Qdas.

Synchronous Qdas: In synchronous Qdas, there is no concurrency in the
sense that there is at most one running task that can fire an action at all times. All
the other tasks have necessarily performed a synchronous dispatch and are thus
blocked. More precisely, in every reachable configuration $(G, d)$ of a synchronous
Qdas, $G$ is of one of the forms depicted in Fig. 3 (i.e., $v_0, \ldots, v_{n-1} \in V_T$ and
either $v_n \in V_T$ or $v_n \in V_C$). When the current Ctg is of the form Fig. 3(a),
the only possible action is that the scheduler starts running $v_n$'s block and we
obtain a graph of the form Fig. 3(b). In the case where the Ctg is of the
form (b), either $v_n$ terminates, which removes $v_n$ from the Ctg, or $v_n$ executes
an internal action, which does not change the shape of the Ctg, or $v_n$ does
a synchronous call, which adds a call vertex as successor of $v_n$ which will be
directly scheduled. W.l.o.g., we assume in the following that for synchronous
Qdas the combined action of $\mathit{dispatch}_s$ and scheduling the dispatched block is
atomic.

For a Ctg $G$ and $w \in S^*$, we write $G \triangleright w$ iff for all $0 \le i \le n$:
$w_i = state(v_i)$, and the empty Ctg is mapped to the empty word $\varepsilon$. Given a
synchronous Qdas $A$ with set of local states $S$ as before, we can build a
pushdown system with data $P_A$ such that, at all times, the current location
of $P_A$ encodes the current location of the (single) running block in $A$, and
the stack content records the sequence of synchronous dispatches, as described
above. A guard or assignment in $A$ is kept as is in $P_A$. A synchronous
dispatch $(s, \mathit{dispatch}_s(q, \gamma), s')$ in $A$ is simulated by a push of $s'$ (to
record the local state that has to be reached when the callee terminates) and
moves the current state of $P_A$ to the initial state of $\gamma$. The termination
of a block is simulated by a pop (and we encode the termination of main by
testing the stack's emptiness).
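
The simulation just described can be illustrated by a small sketch. It is not the construction of the paper itself: the data part (guards and assignments over the variables) is abstracted into plain internal steps, and the function name `successors`, the rule formats and the use of `None` for the bottom state are assumptions made for this illustration only.

```python
def successors(config, internal, dispatches, initial, final):
    """Successors of a configuration (state, stack) of the pushdown system.

    internal:   pairs (s, s2) for guards/assignments (data omitted here)
    dispatches: triples (s, block, s2) for rules (s, dispatch_s(q, block), s2)
    initial:    maps a block name to its initial state
    final:      set of final states
    """
    state, stack = config
    result = []
    for s, s2 in internal:
        if s == state:
            result.append((s2, stack))
    for s, block, s2 in dispatches:
        if s == state:
            # push the return state s2, continue in the callee's initial state
            result.append((initial[block], stack + (s2,)))
    if state in final:
        if stack:
            result.append((stack[-1], stack[:-1]))  # pop: resume the caller
        else:
            result.append((None, ()))  # empty?: main has terminated
    return result
```

At every step, the stack followed by the current state is the word that encodes the path-shaped Ctg of the simulated Qdas.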

Proposition 3 Given a synchronous Qdas $A$, we can construct a pushdown
system with data $P_A$ such that the following holds: for any run
$\rho = c_0 a_1 c_1 \ldots a_n c_n$ of $A$, there exists a run $\pi = x_0 a_1 x_1 \ldots a_n x_n$ in $P_A$
such that for all $c_i = (G_i, d_i)$ and $x_i = (s_i, w_i, d'_i)$ we have $d_i = d'_i$ and
$G_i \triangleright w_i$ ($0 \le i \le n$), and vice versa.

The previous proposition allows us to derive results on the reachability
problem. However, we are interested in the Parikh coverability problem. Let
$f$ be a Parikh image of $A$. Then, by Proposition 3, looking for a reachable
configuration of $A$ that covers $f$ amounts to finding a reachable
configuration $(s_i, w_i, d_i)$ of $P_A$ such that the Parikh image $P$ of $w_i$
satisfies $f \preceq P$ (as the Ctg is encoded by the stack content $w_i$). To achieve
this, we augment $P_A$ with a widget that works as follows. In any location of
$P_A$, we can jump non-deterministically to the widget. Then, the widget pops
all the values from the stack, and checks that at least $f(s)$ symbols $s$ are
present on the stack. The widget jumps to an accepting state iff this is the
case. We call $P_{A,f}$ the resulting Pds. Clearly, one can build such a widget
for all $f$, and this effectively reduces the Parikh coverability problem of
Qdas to the location reachability problem of Pds. Moreover, for all $f$, the
widget is of size exponential in $|S|$ and exponential in the binary encoding
of $\max_{s \in S} f(s)$. Hence, building $P_{A,f}$ requires exponential time:

Proposition 4 Given a synchronous Qdas $A$ with states $S$ and a function
$f : S \to \mathbb{N}$, one can generate a Pds $P_{A,f}$ of size exponential in $A$ and a
state $s$ of $P_{A,f}$, s.t. $P_{A,f}$ reaches $s$ iff $f$ is Parikh coverable in $A$.

As testing emptiness of a pushdown system without data is PTime-C [4], the
Parikh coverability problem is in ExpTime for synchronous Qdas (with both
types of queues). A matching lower bound is obtained by a reduction from the
reachability problem of Pds with data (see Proposition 1). This reduction
requires only one concurrent queue, so the Parikh coverability problem is
ExpTime-hard for synchronous concurrent Qdas. Hence we derive the following:

Theorem 1 The Parikh coverability problem is ExpTime-C for synchronous
and for synchronous concurrent Qdas.

Let us take a closer look at the dispatches that happen in runs of synchronous
Qdas that have only serial queues. Here, each task except the main task blocks
the queue it is started from. Hence, any other block dispatched to one of these
already blocked queues deadlocks. Thus, all reachable Ctg have at most
$|SQID| + 2$ vertices. Hence, the pushdown systems used in all previous
constructions have bounded stack height, and we can apply the emptiness test
on a finite state system. The lower bound can be derived from Proposition 2,
by testing the emptiness of the intersection of $n$ finite processes, which is
PSpace-complete [14].

Theorem 2 The Parikh coverability problem is PSpace-C for serial synchronous 
Qdas. 

Concurrent asynchronous Qdas: Let us now establish a relationship between
concurrent asynchronous Qdas and Petri nets that proves that the Parikh
coverability problem is ExpSpace-complete. We first show how to reduce the
Qdas Parikh coverability problem to the Petri net coverability problem. Given
a concurrent asynchronous Qdas $A$, we construct a Petri net $N_A$ as follows:
The places of $N_A$ are $(X \times D) \cup S$. Each place $s \in S$ counts how many blocks
are currently running and are in state $s$. Each place $(x, d)$ encodes the fact
that variable $x$ contains value $d$ in the current valuation. Note that we have
no place to encode the contents of the queue, as the dispatch of block $\gamma$
directly creates a new token in $s^0_\gamma$. This encoding is, however, correct with
respect to the Parikh coverability problem, as $\mathit{Parikh}(G)$ does not distinguish
between a block $\gamma$ that is waiting in a queue and a task executing $\gamma$ in its
initial state. Thus:

Proposition 5 For all concurrent asynchronous Qdas $A$ with set of locations
$S$, we can build, in polynomial time, a Petri net $N_A$ s.t. $f$ is
Parikh-coverable in $A$ iff $m \in Cover(N_A)$, where $m$ is the marking s.t. for
all $s \in S$: $m(s) = f(s)$ and for all $p \in P \setminus S$: $m(p) = 0$.
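
As an illustration of the place structure above, the following sketch translates a list of Qdas rules into Petri net transitions. The rule format, the function names `qdas_to_petri` and `fire`, and the treatment of termination (a task in a final state simply removes its token) are assumptions made for this example; the text above only fixes the places and the effect of a dispatch.

```python
from collections import Counter

def qdas_to_petri(rules, initial, final, domain):
    """Translate rules (s, action, s2) of a concurrent asynchronous Qdas.

    action is ("guard", x, d), ("assign", x, d) or ("dispatch_a", block).
    Returns a list of transitions (pre, post) over the places S (local
    states) and (x, d) (variable x currently holds value d).
    """
    transitions = []
    for s, action, s2 in rules:
        kind = action[0]
        if kind == "guard":
            _, x, d = action
            transitions.append((Counter([s, (x, d)]), Counter([s2, (x, d)])))
        elif kind == "assign":
            _, x, d = action
            for old in domain:  # one transition per possible previous value
                transitions.append((Counter([s, (x, old)]), Counter([s2, (x, d)])))
        elif kind == "dispatch_a":
            # no queue place: the dispatched block directly receives a token
            # in its initial state
            transitions.append((Counter([s]), Counter([s2, initial[action[1]]])))
    for s in final:  # assumption: a terminating task removes its token
        transitions.append((Counter([s]), Counter()))
    return transitions

def fire(marking, transition):
    """Return the successor marking, or None if the transition is disabled."""
    pre, post = transition
    if any(marking[p] < n for p, n in pre.items()):
        return None
    return marking - pre + post
```

The number of places and transitions is polynomial in the size of the Qdas, since each rule yields at most one transition per data value.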

Let us now reduce the Petri net coverability problem to the Qdas Parikh
coverability problem. Let $N = (P, T, m_0)$ be a Petri net. We associate to $N$
the concurrent asynchronous Qdas
$A_N = (CQID, \emptyset, \Gamma, main, X, \Sigma, (\mathcal{TS}_\gamma)_{\gamma \in \Gamma})$ on the finite domain
$D = \{0, 1\}$, where $CQID = \{C\}$, $\Gamma = \{main, trans\} \cup P$, $X = \{v_p \mid p \in P\}$
and $(\mathcal{TS}_\gamma)_{\gamma \in \Gamma}$ is given by the pseudo-code in Fig. 4 (this construction is
an extension of a construction found in [10]). We assume that, for
$\gamma \in \{trans, main\}$, $s^\ell_\gamma$ is the location of $\gamma$'s Lts that is reached when
the control reaches line $\ell$. Let $G = (V, E, \lambda, queue, state)$ be a Ctg for
$A_N$, and let $m$ be a marking of $N$. Then, we say that $G$ encodes $m$, written
$G \triangleright m$, iff (i) $\mathit{Parikh}(G)(s^{14}_{trans}) = \mathit{Parikh}(G)(s^{8}_{main}) = 1$, (ii) for all
$p \in P$: $\mathit{Parikh}(G)(s^0_p) = m(p)$, and (iii) for all $p \in P$, for all
$s \in S_p \setminus \{s^0_p\}$: $\mathit{Parikh}(G)(s) = 0$. Thus, intuitively, a Ctg $G$ encodes a
marking $m$ iff main is at line 8, trans is at line 14, $m(p)$ counts the number
of $p$ blocks that are either in $C$ or executing but at their initial state,
and there are no $p$ blocks in a non-initial state.

     1  def main():
     2    for each p ∈ P:
     3      v_p := 0
     4      select k_p ∈ {0, ..., m_0(p)}
     5      for i = 1 ... k_p:
     6        dispatch_a(C, p())
     7    dispatch_a(C, trans())
     8    while (true): do nothing

     9  block p():   // for all p ∈ P
    10    while (v_p = 0): do nothing
    11    v_p := 0

    12  block trans():
    13    while (true):
    14      select t = (I_t, O_t) ∈ T
    15      for each p ∈ P s.t. I_t(p) = 1:
    16        v_p := 1
    17      while (∃p ∈ P : v_p = 1): do nothing
    18      for each p ∈ P s.t. O_t(p) = 1:
    19        dispatch_a(C, p())

Figure 4: Encoding of Petri net coverability $(P, T, m_0)$ by a Qdas

The intuition behind the construction is as follows. Each run of the Qdas
$A_N$ starts with an initialization phase, where main initializes all the $v_p$
variables to 0 and dispatches, for all $p \in P$, $k_p$ blocks $p$ with
$k_p \le m_0(p)$, then dispatches a call to trans. At that point, the only possible
action is that the scheduler dequeues all the blocks. All the $p$ tasks are
then blocked, as they need $v_p = 1$ to proceed and terminate. Then, trans
cyclically picks a transition $t$, sets to 1 all the variables $v_p$ such that
$t$ consumes a token in $p$, and waits until all the $v_p$ variables return to 0.
This can only happen if at least $I_t(p)$ $p$ tasks have terminated, for all
$p \in P$. So, when trans reaches line 19, the encoded marking has been decreased
by at least $I_t$. Note that more than $I_t(p)$ $p$ tasks could terminate, as
they run concurrently and lines 10 and 11 do not execute atomically. Then,
trans dispatches one new $p$ block iff $t$ produces a token in $p$. This
increases the encoded marking by $O_t$, so the effect of one iteration of the
main while loop of trans is to simulate the effect of $t$, plus a possible
token loss. Hence, the resulting marking is guaranteed to be in $Cover(N)$ (but
maybe not in $Reach(N)$). This is formalized by the following proposition:

Proposition 6 For all Petri nets $N$, we can build, in polynomial time, a
concurrent asynchronous Qdas $A_N$ s.t. $m \in Cover(N)$ iff there exists
$(G, d) \in Reach(A_N)$ with $G \triangleright m$.

Theorem 3 The Parikh coverability problem is ExpSpace-complete for
concurrent asynchronous Qdas.
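
The token-loss argument behind Proposition 6 can be made concrete with a short sketch. It models only the net effect of one iteration of the loop of trans on the encoded marking, not the Qdas itself; the function names and the dictionary representation of markings are assumptions of this example.

```python
def trans_iteration(marking, consume, produce, extra):
    """Effect of one iteration of the loop of trans on the encoded marking.

    consume and produce play the roles of I_t and O_t; extra[p] is the number
    of additional p tasks that terminated while v_p was still 1 (token loss).
    """
    for p, lost in extra.items():
        if lost and not consume.get(p, 0):
            raise ValueError("p tasks only terminate if t consumes from p")
    result = dict(marking)
    for p in set(consume) | set(extra):
        taken = consume.get(p, 0) + extra.get(p, 0)
        if taken > result.get(p, 0):
            raise ValueError("not enough p tasks available")
        result[p] = result.get(p, 0) - taken
    for p, n in produce.items():
        result[p] = result.get(p, 0) + n
    return result

def covered_by(small, big):
    """True iff small is pointwise below big."""
    return all(n <= big.get(p, 0) for p, n in small.items())
```

Whatever the loss, the marking obtained is pointwise below the marking reached by firing $t$ exactly, which is why the simulation stays inside $Cover(N)$ even when it leaves $Reach(N)$.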

Asynchronous Serial Qdas: Let us show that for the class of Qdas with one
serial queue, and where asynchronous dispatches are allowed, the Parikh
coverability problem is undecidable. We establish this by a reduction from the
control-state reachability problem in a fifo system, which is known to be
undecidable [5].

Intuitively, we use the serial queue to model the unbounded, reliable fifo
queue, where sending a message $m$ is encoded as asynchronously dispatching a
block $\gamma_m$. This block $\gamma_m$ contains the control flow of receiving $m$, i.e.,
it will resume the fifo system's execution directly after receiving $m$. The
fifo system's global state is stored in a global variable. Receiving a certain
message $m$ is encoded as terminating the currently running task and assuring
(via a global variable) that the succeeding task's type is the one of the
expected message.

Theorem 4 The Parikh coverability problem is undecidable for asynchronous 
Qdas with at least one serial queue. 

Concurrent Qdas: Let us show that, once we allow both synchronous and
asynchronous dispatches in a concurrent Qdas, the Parikh coverability problem
becomes undecidable. For that purpose, we give a reduction from the
reachability problem of two-counter systems.

The crux of the construction is the use of variables, i.e., global memory, to
implement a rendez-vous synchronization. Given two distinct tasks, one can
use their nested access to two lock variables to guard a shared data variable
by assuring that a value written to the variable must be read before it is
overwritten.

Let us give the construction's intuition: Each counter is encoded, similarly
to the construction for synchronous Qdas, as a pushdown stack over a singleton
alphabet, i.e., a sequence of nested synchronously dispatched blocks. These
are controlled via rendez-vous from the main task, which in the beginning
asynchronously dispatched the two counters.

Theorem 5 The Parikh coverability problem is undecidable for concurrent Qdas 
that use both synchronous and asynchronous dispatches. 

Termination Problem: We use the previous constructions to directly lift
the undecidability results from the Parikh coverability problem to the
termination problem. The close connection of synchronous Qdas with Pds (with
data) allows us to directly derive an ExpTime algorithm for the termination
problem from the emptiness test for Büchi Pds [9]. To the best of our
knowledge, no completeness result is known for the latter problem, which
leaves a gap to the directly derivable PSpace-hardness via finite systems.
The result for asynchronous concurrent Qdas directly follows from Petri nets
[15, 16].

Theorem 6 The termination problem is PSpace-C for synchronous serial Qdas,
it is in ExpTime and PSpace-hard for synchronous Qdas, and it is ExpSpace-C
for asynchronous concurrent Qdas. It is undecidable for asynchronous serial
Qdas, and for Qdas that use both synchronous and asynchronous dispatches.

5 Extending QDAS with Fork/Join

We return to the introductory matrix multiplication example. The crux of
the algorithm is the parallel for-loop that forks a finite number of subtasks
and waits for their termination (join). The latter had to be implemented via a
global semaphore which (i) restricts the number of forkable tasks by the
underlying finite value domain, and (ii) needs to be properly guarded by the
programmer for access outside fork and join. In the following we thus want to
extend Qdas by an explicit fork/join construct (which also exists in GCD).
Further, the given matrix multiplication algorithm depended on an a priori
fixed size for the factor matrices; in practice, however, one wants to verify
the algorithm for any possible (correct) input of any size. Thus, we need to
consider the verification of extended Qdas where the number of forked tasks is
parametrized by the input.

As fork/join behaviour relies on asynchronously dispatching tasks on a
concurrent queue, we ignore in the following synchronous dispatches and serial
queues, thus also partially avoiding the previous basic undecidability results.
Note that asynchronous concurrent Qdas can be regarded as over-approximations
of all other classes of Qdas.

QDAS extended by fork/join: A Qdas extended by fork/join (eQdas) is a tuple
$(CQID, \emptyset, \Gamma, main, X, \Sigma, (\mathcal{TS}_\gamma)_{\gamma \in \Gamma})$ that is equivalent to a Qdas
except that we replace in $\Sigma$ the synchronous dispatch by the following action:
$\{\mathit{forkjoin}\} \times CQID \times \Gamma \times (\mathbb{N} \cup \{*\})$. The parameter of a forkjoin action
is the last value of the tuple. An eQdas is *-free if in all $\mathcal{TS}_\gamma$ for
$\gamma \in \Gamma$ the parameter of the forkjoin action is not *.

The semantics of an eQdas is given, analogously to standard Qdas, as a
transition system $(C, c^0, \Sigma, \Rightarrow)$ where we additionally extend the transition
relation given by tuples $((G, d), a, (G', d'))$ by the following case:

Fork/join: $a = \mathit{forkjoin}(q, \gamma, p)$ with $p \in (\mathbb{N} \cup \{*\})$, $d' = d$, and there
    are $\delta = (s, a, s') \in \Delta$ and $G'' \in step(\delta(G))$ such that: if $p = *$ then
    we choose non-deterministically an $n \in \mathbb{N}$, else $n = p$, so that
    $G' = G''_n$, where $G''_0 = G''$ and for $0 \le i < n$ we define
    $G''_{i+1} = letwait(v, v'_{i+1})(enqueue(q, \gamma_{i+1})(G''_i))$, where $v$ is the
    node whose state has changed during the step operation, and $v'_{i+1}$ is
    the fresh node that has been created by the enqueue operation.

Intuitively, a forkjoin action appends a sequence of blocks to a queue,
additionally adding a wait edge to each newly created node. Hence, the join is
modeled by a separate action that is taken by the scheduler after deleting the
wait edges.
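
The following sketch illustrates this reading of the forkjoin action on a deliberately simplified graph: a dictionary of queue contents plus a set of wait edges. The representation, the node naming and the helper functions `finish` and `is_blocked` are assumptions of this example; in particular, it does not distinguish queued blocks from running tasks, as the Ctg of the paper does.

```python
from itertools import count

_ids = count(1)  # source of fresh node identifiers

def forkjoin(graph, caller, queue, block, n):
    """Enqueue n copies of block and let caller wait for each of them."""
    queues = {q: list(nodes) for q, nodes in graph["queues"].items()}
    wait = set(graph["wait"])
    for _ in range(n):
        node = (block, next(_ids))                 # fresh node
        queues.setdefault(queue, []).append(node)  # enqueue(q, block)
        wait.add((caller, node))                   # letwait(caller, node)
    return {"queues": queues, "wait": wait}

def finish(graph, node):
    """A forked task terminates: remove it and the wait edges pointing to it."""
    queues = {q: [v for v in nodes if v != node]
              for q, nodes in graph["queues"].items()}
    wait = {(u, v) for (u, v) in graph["wait"] if v != node}
    return {"queues": queues, "wait": wait}

def is_blocked(graph, caller):
    """The join can only be taken once no wait edge leaves the caller."""
    return any(u == caller for (u, _) in graph["wait"])
```

With `n = 1` the caller waits for exactly one callee, which matches the behaviour of a synchronous dispatch.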

The extended Parikh coverability problem asks, given an eQdas $A$ with
locations $S$ and a mapping $f : S \to \mathbb{N}$, whether there exists
$c = (G, d) \in Reach(A)$ with $f \preceq \mathit{Parikh}(G)$. The extended termination problem
asks, given an eQdas $A$, whether there is no infinite run possible in $A$.

As forkjoin actions with parameter 1 are semantically equivalent to a
synchronous dispatch action, we can directly carry over the two-counter
machine simulation from the proof of Theorem 5 to eQdas.

Theorem 7 Both the extended Parikh coverability problem and the extended
termination problem are undecidable.

Consequently, we focus in the following on two distinct over-approximations
for eQdas that allow us to give approximate answers to our verification
problems.

*-free eQdas: Let $A$ be an eQdas that is *-free. We construct a Petri net
$N_A$ by extending the previous construction from asynchronous concurrent Qdas
to Petri nets as follows: As in the eQdas semantics, we split a single forkjoin
action of a block $\gamma$ on a queue $q$ with parameter $n \in \mathbb{N}$ into (i) a fork
transition that creates $n$ new tokens in $s^0_\gamma$, and (ii) a subsequent join
transition that depends on taking $n$ tokens from the place representing
$f_\gamma$. Analogous to the proof of Proposition 5 we can show the following:

Proposition 7 For all *-free eQdas $A$ with set of locations $S$, we can build
in polynomial time a Petri net $N_A$ s.t. $f$ is Parikh-coverable in $A$ if
$m \in Cover(N_A)$, where $m$ is the marking s.t. for all $s \in S$: $m(s) = f(s)$ and
for all $p \in P \setminus S$: $m(p) = 0$. Further, if $N_A$ terminates, then $A$ is
guaranteed to terminate.

As coverability and termination are decidable for Petri nets, we can decide
extended Parikh coverability and extended termination on this over-abstraction.

eQdas with * parametrized fork/join: Given an eQdas $A$ that is not *-free,
we construct a Petri net $N^*_A$ as follows, starting from the construction for
asynchronous concurrent Qdas: For forkjoin actions whose parameter is not *,
we proceed as in the above construction for *-free eQdas. However, we need
to model the forking of an arbitrary number of blocks when the parameter of
the forkjoin action equals *. For this, we use Petri nets extended with
ω-arcs. An outgoing arc of a transition labeled with ω adds an arbitrary number
of tokens to the corresponding place; thus, we translate the fork of block $\gamma$
into an ω-transition leading to place $s^0_\gamma$. The join is approximated by a
transition that non-deterministically chooses to advance the original
workflow, ignoring forked tasks that have not yet terminated. Thus, by
extending the proof of Proposition 5:

Proposition 8 For all eQdas $A$ with set of locations $S$, we can build in
polynomial time a Petri net $N^*_A$ s.t. $f$ is Parikh-coverable in $A$ if
$m \in Cover(N^*_A)$, where $m$ is the marking s.t. for all $s \in S$: $m(s) = f(s)$
and for all $p \in P \setminus S$: $m(p) = 0$. Further, if $N^*_A$ terminates, then $A$ is
guaranteed to terminate.

We have recently shown that the termination problem is decidable for Petri
nets with ω-arcs [12]. Hence, extended termination is also decidable on the
previous abstraction.

With respect to coverability, we can replace the ω-arcs of $N^*_A$ by a
non-deterministic loop that adds an arbitrary number of tokens to the original
arc's target place. Note that this simple trick does not work for verifying
termination. Consequently, we can use the known algorithms for coverability on
this polynomially larger standard Petri net, and hence the extended Parikh
coverability problem is decidable on this abstraction.
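
One possible way to realize the replacement of ω-arcs by a loop is sketched below. The text does not spell out the gadget, so the concrete shape used here (one fresh control place and one "pump" transition per ω-arc) is an assumption of this example, as are the names `OMEGA` and `expand_omega` and the dictionary encoding of transitions.

```python
OMEGA = "omega"  # marks an outgoing arc that adds arbitrarily many tokens

def expand_omega(transitions):
    """Replace every omega-arc by a fresh control place and a pump loop.

    transitions maps a name to (pre, post), where pre and post map places to
    arc weights and post weights may be OMEGA.
    """
    result = {}
    for name, (pre, post) in transitions.items():
        plain = {p: n for p, n in post.items() if n != OMEGA}
        for p, n in post.items():
            if n == OMEGA:
                ctrl = ("pump", name, p)  # fresh control place
                plain[ctrl] = 1           # firing `name` enables the loop
                # the loop keeps its control token and adds one token to p
                result[ctrl] = ({ctrl: 1}, {ctrl: 1, p: 1})
        result[name] = (dict(pre), plain)
    return result
```

Each ω-arc contributes one place and one transition, so the resulting net is only polynomially larger. The pump transition can be fired forever once enabled, which is consistent with the remark that the trick is unsuitable for termination.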

6 Conclusion & Outlook 

We introduce what is, to the best of our knowledge, the first formal model
that captures the core of GCD and that allows us to derive basic results on
the decidability of verification questions for it. Due to the undecidability
issues of the model, we currently focus on several under- and
over-approximative approaches (e.g., language bounded verification, graph
minor based abstractions, novel Petri net extensions [12]) as well as
enhancements for additional GCD features like task groups, priorities, and
timer events.
A Proof for Section 2

Proposition 1 The reachability problem is ExpTime-C for Pds with data.

Proof. For the upper bound, we generate a reachability-equivalent Pds (without
data) by encoding all possible data valuations into the pushdown system's
states. This leads to an exponential blowup of the state space. The lower bound
can be derived by a reduction from the emptiness test of the intersection of a
context-free language with $n$ regular languages, which is known to be
ExpTime-hard (hardness follows easily by a reduction from linearly bounded
alternating Turing machines; a closely related problem, the reachability of
pushdown systems with checkpoints, is shown to be ExpTime-hard in (*)).

(*) Javier Esparza, Antonín Kučera, and Stefan Schwoon: Model checking LTL
with regular valuations for pushdown systems. Information and Computation,
186(2):355-376, 2003.

B Proofs of Section 4 

Synchronous Qdas: Let $A$ be a synchronous Qdas with a set of locations
$S$, a set of rules $\Delta$, a set of final states $F$, and set of queues SQID. Let
$G$ be a Ctg of one of the forms given in Fig. 3, and let $w = w_0 w_1 \cdots w_n$ be
a word in $S^*$. Then, $G$ is encoded by $w$, written $G \triangleright w$, iff for all
$0 \le i \le n$: $w_i = state(v_i)$, and the empty Ctg is mapped to the empty word $\varepsilon$.

Given a synchronous Qdas $A = (CQID, SQID, \Gamma, main, X, \Sigma, (\mathcal{TS}_\gamma)_{\gamma \in \Gamma})$
with set of local states $S$ as before, we build a pushdown system with data
$P_A = (Y, X, y^0, S, \Sigma_P, \Delta_P)$ where:

• the set of states is $Y = S \cup \{\varepsilon\}$ and the initial state is $y^0 = s^0_{main}$

• $\Sigma_P = (\{push, pop\} \times S^*) \cup \{empty?\} \cup guards(X) \cup assign(X)$

• a tuple $(y, a, y')$ is a transition rule in $\Delta_P \subseteq Y \times \Sigma_P \times Y$ iff

    - $a \in guards(X) \cup assign(X)$ and $(y, a, y') \in \Delta$, or

    - $a = push(s')$, $y = s$, $(s, \mathit{dispatch}_s(q, \gamma), s') \in \Delta$ and $y' = s^0_\gamma$, or

    - $a = pop(s)$, $y \in F$ and $y' = s$, or

    - $a = empty?$, $y = f_{main}$, and $y' = \varepsilon$.

Thus, at all times, the current location of $P_A$ encodes the current location
of the (single) running block in $A$, and the stack content records the
sequence of synchronous dispatches, as described above. A guard or assignment
in $A$ is kept as is in $P_A$. A synchronous dispatch $(s, \mathit{dispatch}_s(q, \gamma), s')$
in $A$ is simulated by a push of $s'$ (to record the local state that has to be
reached when the callee terminates) and moves the current state of $P_A$ to the
initial state of $\gamma$. The termination of a block is simulated by a pop (and we
use the empty? action for the termination of main).
Proposition 3 Given a synchronous Qdas $A$, we can construct a pushdown
system with data $P_A$ such that the following holds: for any run
$\rho = c_0 a_1 c_1 \ldots a_n c_n$ of $A$, there exists a run $\pi = x_0 a_1 x_1 \ldots a_n x_n$ in $P_A$
such that for all $c_i = (G_i, d_i)$ and $x_i = (s_i, w_i, d'_i)$ we have $d_i = d'_i$ and
$G_i \triangleright w_i$ ($0 \le i \le n$), and vice versa.

Proof. We assert that the semantics of $P_A$ is the usual semantics for
pushdown systems with data, i.e., an infinite transition system with
configurations $c = (y, w, d) \in Y \times S^* \times D^X$. Thus, we can interpret
configurations also as follows: $(x, d) \in S^* \times D^X$ with
$x = w \cdot y \in S^* \cdot (S \cup \{\varepsilon\})$.

Let $(G, d) \in Reach(A)$ be reachable by a run
$(G_0, d_0) a_1 (G_1, d_1) a_2 \ldots a_n (G_n, d_n)$. Then we can induce a run
$(x_0, d'_0) a_1 (x_1, d'_1) a_2 \ldots a_n (x_n, d'_n)$ in $P_A$ such that $d_i = d'_i$ and
$G_i \triangleright x_i$ for $0 \le i \le n$.

By construction of $P_A$, $G_0 \triangleright x_0$ and $d_0 = d'_0$. We now assume that there
exists a prefix of the Qdas's run of length $0 \le j < n$ of the form
$(G_0, d_0) \ldots (G_j, d_j)$ such that there exists a run of the pushdown system
$(x_0, d'_0) \ldots (x_j, d'_j)$ that fulfills the induction hypothesis. We now
consider the outcome of a Qdas transition labeled $a_{j+1}$. We know that $G_j$
must be a path of vertices $v_0 \ldots v_n$ connected by wait edges.

Sync. dispatch: dispatching a block $\gamma$ on queue $q$ leads to
    $(G_{j+1}, d_{j+1})$ with $d_j = d_{j+1}$, and $G_{j+1}$ is a path graph
    $v_0 v_1 \ldots v_n v_{n+1}$ with new distinct vertex $v_{n+1}$ where
    $state(v_{n+1}) = s^0_\gamma$. We mapped the dispatch rule to a push of the
    current state to the pushdown and a jump to the new initial state, i.e.,
    we go from $(x_j, d'_j)$ to $(x_{j+1}, d'_{j+1})$ where $d'_j = d'_{j+1}$ and
    $x_{j+1} = x_j \cdot s^0_\gamma$. Obviously, $G_{j+1} \triangleright x_{j+1}$.

Test/Assignment: $G_{j+1}$ equals $G_j$ except for $state_j(v_n) = s$ and
    $state_{j+1}(v_n) = s'$ and a possible change of $d_{j+1}$ according to the
    underlying data action. Executing the same action on $P_A$ assures that
    $d_{j+1} = d'_{j+1}$, and changing the control state of the pushdown only
    changes $x_j = w \cdot s$ to $x_{j+1} = w \cdot s'$; thus, $G_{j+1} \triangleright x_{j+1}$.

Termination: To apply the action, $G_j$ consists of a (non-empty) path ending
    in $v$ with $state_j(v) \in F$, and $G_{j+1} = G_j \setminus v$, and $d_j = d_{j+1}$. Note
    that $G_{j+1}$ could possibly be empty. Given $(x_j, d'_j)$ according to the
    induction hypothesis, we have to consider two cases: either
    $x_j = w_j \cdot y_j$ with $w_j \in S^+$ and $y_j \in S$ (i.e., there is at least one
    element on the stack), or $x_j = y_j \in S$ (i.e., the stack is empty). In the
    second case, we know that $x_j \in S_{main}$ and, by the induction hypothesis,
    that $x_j = f_{main}$ and $G_j$ is a path of length 1. Now, $P_A$ takes the
    empty? transition leading to the (bottom) state $\varepsilon$, i.e., $x_{j+1} = \varepsilon$,
    hence $G_{j+1}$ is empty and $G_{j+1} \triangleright \varepsilon$. If the stack is not empty, then
    we can take a pop transition such that $x_{j+1} = w \in S^+$ for
    $x_j = w \cdot s$, hence $G_{j+1} \triangleright x_{j+1}$. Obviously
    $d_{j+1} = d_j = d'_j = d'_{j+1}$.

(Recall that we asserted dispatch and scheduling/dequeueing to be atomic, so
we do not need to consider other actions of the scheduler.)

The reverse direction follows analogously, as the previous inductive
construction used only necessary and sufficient steps. □

Lemma 1 Given a finite set $S$ and a function $f : S \to \mathbb{N}$, there exists a
finite automaton $T_f$ with alphabet $S$ of size exponential in $|S|$ and
polynomial in (the binary encoding of) $\max_{s \in S} f(s)$ such that
$L(T_f) = \{w \in S^* : |w|_s \ge f(s) \text{ for all } s \in S\}$.

Proof. Given a set $S$ and a function $f : S \to \mathbb{N}$, let $k = \max_{s \in S} f(s)$
(which must exist as $S$ is finite). Then $T_f$ is the finite automaton
$(Q, S, q^0, \Delta, q^f)$ with states $Q = \{0, \ldots, k\}^S$ (i.e., $S$-indexed vectors of
values in $0 \ldots k$), action alphabet $S$, initial state $q^0$ where
$q^0(s) = f(s)$, and final state $q^f$ where $q^f(s) = 0$. The transitions of $T_f$
are defined as follows: $(q, s, q') \in \Delta$ iff $q'(s) = q(s) - 1$ for $q(s) \ge 1$,
else $q'(s) = q(s)$, and for all $t \in S \setminus \{s\}$ we have $q'(t) = q(t)$. Thus each
transition labeled by an action $s$ reduces the "counter" $q(s)$ by one until
zero and, once arrived at zero, the counter $q(s)$ remains zero for any further
$s$ action. Further, the control structure of $T_f$ is acyclic (except for the
loops at $q^f$), thus each run can visit each state in $Q \setminus \{q^f\}$ at most once.

If $w = a_1 \ldots a_n \in L(T_f)$ then it was accepted by a run $q_0 a_1 q_1 \ldots a_n q_n$
where $q_0 = q^0$ and $q_n = q^f$. Due to our construction of $T_f$, it holds for
$w = a_1 \ldots a_n$ that $|w|_s \ge q_0(s) = f(s)$ for all $s \in S$. If $w \notin L(T_f)$ then
there exists a run $q_0 a_1 q_1 \ldots a_n q_n$ where $q_0 = q^0$ and $q_n \neq q^f$, so
there exists at least one $s \in S$ such that $q_n(s) > 0$; each transition
$(q_{i-1}, a_i, q_i)$ assures that $q_{i-1}(s) \ge q_i(s)$, hence $|w|_s < f(s)$ for at
least one $s \in S$. □
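
The automaton of Lemma 1 can be sketched directly as a vector of saturating counters. The function name `make_automaton`, the tuple representation of states, and the on-the-fly exploration (the state set $Q$ is never built explicitly) are choices made for this illustration.

```python
def make_automaton(f):
    """Counter automaton for the language {w : |w|_s >= f(s) for all s}.

    f maps each symbol of the alphabet S to a natural number. States are
    tuples of remaining counts, indexed by the sorted symbols.
    """
    symbols = sorted(f)
    initial = tuple(f[s] for s in symbols)
    final = tuple(0 for _ in symbols)

    def step(q, s):
        # reading s decrements the counter of s, saturating at zero
        i = symbols.index(s)
        return q[:i] + (max(q[i] - 1, 0),) + q[i + 1:]

    def accepts(word):
        q = initial
        for s in word:
            q = step(q, s)
        return q == final

    return initial, final, step, accepts
```

For example, with `f = {"a": 2, "b": 1}` the word `"aab"` is accepted while `"ab"` is not, since the latter contains only one `a`.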

Proposition 4 Given a synchronous Qdas $A$ with states $S$ and a function
$f : S \to \mathbb{N}$, one can generate a Pds $P_{A,f}$ of size exponential in $A$ and a
state $s$ of $P_{A,f}$, s.t. $P_{A,f}$ reaches $s$ iff $f$ is Parikh coverable in $A$.

Proof. [Prop. 4] First, we construct the Pds with data $P_A$ and states $S$
as mentioned before. Then, we translate the Pds with data to a bisimilar
Pds without data $P'_A = (Y, y^0, \Phi, \Sigma, \Delta)$ by encoding all possible valuations
of variables into the Pds's states by the standard product construction, i.e.,
$Y = S \times (X \to D)$. Given $y \in Y$, let $S(y) \in S$ denote the original state
component. Note: $P'_A$ is at most exponentially larger than $P_A$, and this
construction does not change the pushdown system's behaviour with respect to
the stack but only internal actions.

Second, from the function $f$, we construct the automaton
$T_f = (Q, S, q^0, \Delta_T, q^f)$ analogous to Lemma 1.

Finally, we define the Pds $P_{A,f} = (\hat{Y}, \hat{y}^0, \hat{\Phi}, \hat{\Sigma}, \hat{\Delta})$ as follows:

• states are $\hat{Y} = Y \cup Q$ (assuring disjointness by relabeling when necessary)

• $\hat{y}^0 = y^0$ is the initial state

• $\hat{\Phi} = \Phi$ is the stack alphabet (where $\Phi = S$ due to the above construction)

• $\hat{\Sigma} = \Sigma \cup \{\varepsilon\}$

• a tuple $(y, a, y')$ is a rule in $\hat{\Delta} \subseteq \hat{Y} \times \hat{\Sigma} \times \hat{Y}$ iff one of the following holds:

    - $(y, a, y') \in \Delta$ (include all transition rules of $P'_A$);

    - $a = pop(s)$ for $s \in \Phi$ and $(q, s, q') \in \Delta_T$ (include rules of $T_f$ and
      change an $s$ action to $pop(s)$ for $s \in S$);

    - $y \in Y$, $a = push(z)$ for $z = S(y)$, and $y' = q^0$ (connect all states in
      $Y$ with the initial state of $T_f$, additionally storing the current
      "state" component on the stack).

Note that $P_{A,f}$ is of size exponential with respect to both the Qdas and $f$
due to serial composition.

We now have to show that if there is a run in $P_{A,f}$ that reaches the state
$q^f$, then there exists a configuration $c = (G, d)$ of $A$ such that
$f \preceq \mathit{Parikh}(G)$.

Assume that there exists a run of $P_{A,f}$ reaching $q^f$. Then it must be of the
following form $(x_0, a_1, x_1, \ldots, a_k, x_k, a_{k+1}, x_{k+1}, a_{k+2}, \ldots, x_n)$ where
$x_i = (y_i, w_i) \in \hat{Y} \times S^*$ are the configurations of the corresponding
infinite transition system. Further, $y_0 = y^0$, $y_n = q^f$, $y_{k+1} = q^0$, and
$(y_1 \ldots y_k)$ is a subrun that only uses states in $Y$ as well as transitions in
$P'_A$; $\{y_{k+1}, \ldots, y_n\} \subseteq Q$ and the corresponding transitions are derived
from $\Delta_T$, and $a_{k+1} = push(S(y_k))$.

Let us take a closer look at the first part of the run: $(x_0, a_1, \ldots, a_k, x_k)$
is equivalent to a run of $P_A$ that reaches a configuration $x_k$. The latter
is, following Propositions 3 and ??, similar to a run of the original Qdas $A$
that reaches a configuration $c = (G, d)$ where $G \triangleright w_k \cdot S(y_k)$. Thus,
$c \in Reach(A)$.

The transition $(x_k, push(S(y_k)), x_{k+1})$ now transfers the encoding of $G$ to
the stack, i.e., $w_{k+1} = w_k \cdot S(y_k)$. All other information on data encoded in
$y_k$ is lost in this step.

Now, by Lemma 1 we know that the subrun $(x_{k+1}, a_{k+2}, \ldots, a_n, x_n)$ leading
to the final state of $T_f$ assures that $|w_{k+1}|_s \ge f(s)$ for all $s \in S$. Hence,
for the previously found $c = (G, d) \in Reach(A)$ it holds that
$f \preceq \mathit{Parikh}(G)$. □

Let us take a closer look on the dispatches that happen in runs of synchronous 
Qdas that have only serial queues. Assume a run of such a Qdas, and suppose 
the first dispatch performed along this run (by main) is dispatchs((7, 7). As 
the dispatch is synchronous, main is blocked, and the scheduler has to dequeue 
7 to let the system progress. Cleraly, if 7 performs a synchronous dispatch 
dispatchs(g, 7') to the same queue q, we reach a deadlock. Indeed, the task 
running 7 is blocked by the synchronous dispatch of 7', but we need to wait for 
the termination of 7 to be able to dequeue 7' from q (because q is serial). So, 7 
has to dispatch its blocks to other queues. For the same reason, we also reach 

1  global state := x^0
2  global c_queue q

4  def φ():  // for each φ ∈ Φ ∪ {main}
6    while (true):
7      select (s, a, s') ∈ Δ_P where state = s
9      if a = push(φ'):
10       state := s'
11       dispatch_s(q, φ')
13     if a = pop(φ') and φ = φ':
14       state := s'
15       terminate

Figure 5: From a pushdown system to a Qdas: the blocks main and φ (for φ ∈ Φ), and the Ctg in $Reach(A_{\mathcal{P}})$.

a deadlock if a block called by 7 performs a synchronous dispatch into q. We 
conclude that, in all reachable Ctg, the following holds for all queues: either the 
queue contains one block and there is no running task from this queue, or the 
queue is empty, and there is at most one running task from this queue. Hence, 
all the reachable Ctg have at most $|SQID| + 2$ vertices. Thus, the pushdown
systems used in all previous constructions have bounded stack height and we can 
apply the emptiness test on a finite state system when proving Proposition 4. 
The lower bound can be derived from Proposition 2. Thus we can derive: 

Proposition 4 Given a synchronous Qdas $A$ with states $S$ and a function $f : S \to \mathbb{N}$, one can generate a Pds $\mathcal{P}_{A,f}$ of size exponential in $A$ and a state $s$ of $\mathcal{P}_{A,f}$, s.t. $\mathcal{P}_{A,f}$ reaches $s$ iff $f$ is Parikh coverable in $A$.

B.0.1 From Pds to Qdas

Given a Pds $\mathcal{P}$, we construct a synchronous Qdas $A_{\mathcal{P}}$ as shown in Figure 5. The underlying idea is the inverse of the above simulation: we map a push action of a letter $\phi$ to a synchronous dispatch call of a block $\phi$, and we simulate the stack contents in the Ctg such that we can only map a pop action to a task's termination if we match the topmost letter of the stack, encoded in the block name.

The control state of the Pds is stored in the variable state, and the behaviour of the control structure of $\mathcal{P}$ is encoded as a non-deterministic choice (line 7). This choice ensures that reaching the dispatch and termination actions (lines 11/15) requires the selected transition rule to agree with the current change of the variable state from $s$ to $s'$, and that a pop($\phi'$) action is only possible if the currently running task is labeled by the block name $\phi'$ (line 13).

A reachable configuration of $A_{\mathcal{P}}$ is given by $(G, d)$ where $G$ is, as discussed before, a path of vertices $v_0 v_1 \ldots v_k$. As before, synchronous dispatch calls ensure that no more than one task is active at the same time. Given $c = (G, d) \in Reach(A_{\mathcal{P}})$ and a configuration $y = (x, w) \in X \times \Phi^*$ that is reachable in $\mathcal{P}$, then $c$ is represented by $y$, written $c \triangleright y$, iff $d(\mathit{state}) = x$ and, for $w = w_1 \ldots w_k$, $\lambda(v_i) = w_i$ for $1 \le i \le k$ and $\lambda(v_0) = \mathit{main}$. Hence, the state of the Pds is stored in the variable state, and the path $v_1 \ldots v_k$ encodes the stack content in the blocks of the underlying tasks, where the empty stack is represented by a single vertex labeled by main.
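
The representation relation described above can be illustrated with a small sketch. It is not the construction of the paper itself: the tuple encoding of rules and the function names `encode` and `step` are assumptions made for illustration, and the scheduler and the queue are abstracted away so that only the path of block labels and the value of state remain.

```python
from typing import Dict, List, Optional, Tuple

Path = List[str]                      # labels of v_0 v_1 ... v_k
Rule = Tuple[str, Tuple[str, str], str]  # (s, (kind, letter), s'), hypothetical encoding


def encode(x: str, w: List[str]) -> Tuple[Path, Dict[str, str]]:
    """Map a Pds configuration (x, w) to a path of block labels and a valuation.

    v_0 is labeled main and v_i carries the i-th stack letter, so the empty
    stack becomes the single vertex main.
    """
    return ["main"] + list(w), {"state": x}


def step(path: Path, d: Dict[str, str], rule: Rule) -> Optional[Tuple[Path, Dict[str, str]]]:
    """Apply one guessed rule the way the running block would; None if it is blocked."""
    s, (kind, letter), s_next = rule
    if d["state"] != s:
        return None                                   # the select of line 7 cannot pick it
    if kind == "push":
        return path + [letter], {"state": s_next}     # synchronous dispatch of block `letter`
    if kind == "pop" and len(path) > 1 and path[-1] == letter:
        return path[:-1], {"state": s_next}           # the topmost task terminates
    return None
```

For instance, pushing a letter and popping it again returns to the encoding of the empty stack, while a pop of a letter that does not label the last vertex is rejected.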

Proposition 9 Given a pushdown system $\mathcal{P}$, we can generate a synchronous Qdas $A_{\mathcal{P}}$ such that the following holds: for any run $\pi = y_0 a_1 y_1 \cdots a_n y_n$ in $\mathcal{P}$ there exists a run $\rho = c_0 a_1 c_1 \ldots a_n c_n$ of $A_{\mathcal{P}}$ such that $c_i \triangleright y_i$ for all $0 \le i \le n$, and vice versa.

Proof. Consider a run $\pi = y_0 a_1 y_1 \ldots a_k y_k$ of the Pds $\mathcal{P}$. W.l.o.g. let us consider in the following the underlying sequence of configurations and fired transition rules $y_0 \delta_1 y_1 \ldots \delta_k y_k$ where $\delta_i = (x_i, a_i, x_i') \in \Delta_{\mathcal{P}}$ for $1 \le i \le k$.

We show inductively how $A_{\mathcal{P}}$ generates a run that simulates $\pi$.

For the initial configuration $y_0 = (x^0, \varepsilon)$ of $\mathcal{P}$ and the initial configuration $c_0 = (G, d)$, where $G$ consists of a single node $v_0$ with $\lambda(v_0) = \mathit{main}$ and $d(\mathit{state}) = x^0$, it holds that $c_0 \triangleright y_0$.

Now assume that the Pds $\mathcal{P}$ reached configuration $y_i$ ($0 \le i < k$) such that $A_{\mathcal{P}}$ simulated the prefix of the run until $c_i = (G_i, d_i)$ with $c_i \triangleright y_i$. Assume that $G_i$ is a path $v_0 v_1 \ldots v_\ell$. We do a case-by-case analysis with respect to $\delta_{i+1} = (x, a, x')$ that leads to $y_{i+1}$:

• only the task corresponding to $v_\ell$ is active, and the only way to exit its while loop is via lines 11 and 15, which ensure that line 7 selected $\delta = (x, a, x') \in \Delta_{\mathcal{P}}$ with $d_i(\mathit{state}) = x$, and that we set $d_{i+1}(\mathit{state}) = x'$;

• if $a = \mathrm{push}(\phi)$ for $\phi \in \Phi$, then we fire the synchronous dispatch, which leads to $G_{i+1} = v_0 \cdots v_\ell v_{\ell+1}$ with $\lambda(v_{\ell+1}) = \phi$, thus $(G_{i+1}, d_{i+1}) \triangleright y_{i+1}$;

• if $a = \mathrm{pop}(\phi)$ for $\phi \in \Phi$ and we left the while loop, then $\lambda(v_\ell) = \phi$ (by line 13), and $G_{i+1}$ equals $v_0 \ldots v_{\ell-1}$, thus $(G_{i+1}, d_{i+1}) \triangleright y_{i+1}$.

The reverse direction follows analogously by considering lines 10; 11 and 14; 15 as atomic actions (i.e., setting the state variable and changing the call graph of the Qdas).

B.1 Asynchronous Concurrent Qdas

Proposition 5 For all concurrent asynchronous Qdas $A$ with set of locations $S$, we can build, in polynomial time, a Petri net $N_A$ s.t. $f$ is Parikh-coverable in $A$ iff $m \in Cover(N_A)$, where $m$ is the marking s.t. for all $s \in S$: $m(s) = f(s)$ and for all $p \in P \setminus S$: $m(p) = 0$.

The proof of the proposition relies on the following lemma, showing that $N_A$ can simulate precisely the sequence of Parikh images that are reachable in $A$. Let $(G, d)$ be a configuration of $A$, and let $m$ be a marking of $N_A$. We say that $m$ encodes $(G, d)$, written $m \triangleright (G, d)$, iff: (i) for all $x \in X$: $m(x, d(x)) = 1$, (ii) for all $x \in X$ and all $d' \in D \setminus \{d(x)\}$: $m(x, d') = 0$, and (iii) for all $s \in S$: $m(s) = \mathrm{Parikh}(G)(s)$. Then:
Lemma 2 Let $A$ be a concurrent asynchronous Qdas with set of variables $X$ and set of locations $S$, and let $N_A$ be its associated Pn. Then, for all $(G, d) \in Reach(A)$ there is $m \in Reach(N_A)$ s.t. $m \triangleright (G, d)$, and for all $m \in Reach(N_A)$ there is $(G, d) \in Reach(A)$ s.t. $m \triangleright (G, d)$.

Proof. We prove the two statements separately. 

Let $(G, d)$ be a configuration in $Reach(A)$, and let $(G_0, d_0) a_0 (G_1, d_1) a_1 \cdots a_{n-1} (G_n, d_n)$ be a run s.t. $(G, d) = (G_n, d_n)$. Let us build, inductively, a run $m_0 m_1 \cdots m_k$ of $N_A$ s.t. $m_k \triangleright (G, d)$. The induction is on the length $n$ of the Qdas run.

Base case $n = 0$. It is easy to check that $m_0 \triangleright (G_0, d_0)$.

Inductive case $n = \ell$. Let us assume that $m_0 m_1 \cdots m_j$ is a run of $N_A$ s.t. $m_j \triangleright (G_{\ell-1}, d_{\ell-1})$, and let us show how to complete it, if needed. We consider several cases depending on $a_{\ell-1}$. In the case where $a_{\ell-1} = \varepsilon$ and the scheduler action consists in dequeueing a block from a queue, we have $\mathrm{Parikh}(G_{\ell-1}) = \mathrm{Parikh}(G_\ell)$ and $d_\ell = d_{\ell-1}$. By induction hypothesis $m_j \triangleright (G_{\ell-1}, d_{\ell-1})$, hence $m_j \triangleright (G_\ell, d_\ell)$, and we do not add elements to the run built so far. In the case where $a_{\ell-1} = \mathrm{dispatch}_a(\gamma, q)$, we assume $(s, a_{\ell-1}, s') \in \Delta$ is the corresponding Lts transition. Clearly, $\mathrm{Parikh}(G_\ell)(s') = \mathrm{Parikh}(G_{\ell-1})(s') + 1$, $\mathrm{Parikh}(G_\ell)(s) = \mathrm{Parikh}(G_{\ell-1})(s) - 1$, $\mathrm{Parikh}(G_\ell)(s^0_\gamma) = \mathrm{Parikh}(G_{\ell-1})(s^0_\gamma) + 1$ and for all other locations $s''$: $\mathrm{Parikh}(G_\ell)(s'') = \mathrm{Parikh}(G_{\ell-1})(s'')$. It is easy to check that the Pn transition $t$ s.t. $I(t)(p) = 1$ iff $p = s$ and $O(t)(p) = 1$ iff $p \in \{s', s^0_\gamma\}$ is fireable from $m_j$ (as $m_j \triangleright (G_{\ell-1}, d_{\ell-1})$ by induction hypothesis) and yields the same effect, i.e. the marking $m$ with $m_j \xrightarrow{t} m$ is s.t. $m \triangleright (G_\ell, d_\ell)$. All the other cases (test, assignment and task termination) are treated similarly.
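
As a rough illustration of the dispatch case above, the following sketch builds the input and output multisets of the Petri net transition that mirrors an asynchronous dispatch and fires it on a marking. The place names and the helper names `dispatch_transition` and `fire` are hypothetical; markings are modeled as multisets of place names, and the places that encode variable valuations are left out.

```python
from collections import Counter
from typing import Optional, Tuple

Transition = Tuple[Counter, Counter]  # (input multiset I(t), output multiset O(t))


def dispatch_transition(s: str, s_next: str, gamma_init: str) -> Transition:
    """Transition for (s, dispatch_a(gamma, q), s'): consume s, produce s' and s^0_gamma."""
    return Counter([s]), Counter([s_next, gamma_init])


def fire(marking: Counter, t: Transition) -> Optional[Counter]:
    """Fire t from marking, or return None if t is not fireable."""
    pre, post = t
    marking = Counter(marking)
    if any(marking[p] < n for p, n in pre.items()):
        return None
    marking.subtract(pre)
    marking.update(post)
    return +marking  # unary plus drops places that hold zero tokens


# One task at location "s" dispatches a block whose initial location is "g0".
t = dispatch_transition("s", "s_next", "g0")
after = fire(Counter({"s": 1}), t)
```

The marking obtained this way has the same shape as the Parikh image of the Ctg after the dispatch: one token fewer in $s$, one more in $s'$ and one more in the initial location of the dispatched block.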

Now, let $m_0 m_1 \cdots m_n$ be a run of $N_A$ and let us build, inductively, a run $(G_0, d_0) a_0 (G_1, d_1) a_1 \cdots a_{k-1} (G_k, d_k)$ s.t. $m_n \triangleright (G_k, d_k)$ and all the queues are empty in $G_k$. The induction is on the length $n$ of the Pn run.

Base case $n = 0$. It is easy to check that $m_0 \triangleright (G_0, d_0)$.

Inductive case $n = \ell$. Let us assume that $(G_0, d_0) a_0 \cdots a_{j-1} (G_j, d_j)$ is a run of $A$ s.t. $m_{\ell-1} \triangleright (G_j, d_j)$ and all the queues are empty in $G_j$. Let $t$ be the Pn transition s.t. $m_{\ell-1} \xrightarrow{t} m_\ell$ and let us show how we can extend the run of $A$. We consider several cases. If $t$ is a transition that corresponds to an asynchronous dispatch, then there are $s$, $s'$, $\gamma$ and $q$ s.t. $I_t(p) = 1$ iff $p = s$ and $O_t(p) = 1$ iff $p \in \{s', s^0_\gamma\}$. By definition of $N_A$, there is a transition $(s, \mathrm{dispatch}_a(\gamma, q), s')$ in $A$. Moreover, $m_{\ell-1}(s) \ge 1$, since $t$ is fireable from $m_{\ell-1}$. As $m_{\ell-1} \triangleright (G_j, d_j)$, the transition $(s, \mathrm{dispatch}_a(\gamma, q), s')$ is fireable from $(G_j, d_j)$, and leads to a configuration $(G_{j+1}, d_{j+1})$, where a $\gamma$ block has been enqueued in $q$, hence $d_{j+1} = d_j$, $\mathrm{Parikh}(G_{j+1})(s) = \mathrm{Parikh}(G_j)(s) - 1$, $\mathrm{Parikh}(G_{j+1})(s') = \mathrm{Parikh}(G_j)(s') + 1$, $\mathrm{Parikh}(G_{j+1})(s^0_\gamma) = \mathrm{Parikh}(G_j)(s^0_\gamma) + 1$ and for all other states $s''$: $\mathrm{Parikh}(G_{j+1})(s'') = \mathrm{Parikh}(G_j)(s'')$. It is easy to check that $m_\ell \triangleright (G_{j+1}, d_{j+1})$; however, queue $q$ contains a call to $\gamma$ in $G_{j+1}$ and is thus the only non-empty queue in this Ctg. Thus, from $(G_{j+1}, d_{j+1})$, we execute the scheduler action that dequeues from $q$. This has no effect on the Parikh image of the Ctg. Thus, we reach $(G_{j+2}, d_{j+2})$ s.t. $d_{j+1} = d_{j+2}$, $\mathrm{Parikh}(G_{j+1}) = \mathrm{Parikh}(G_{j+2})$, hence $m_\ell \triangleright (G_{j+2}, d_{j+2})$ too, and all the queues are empty in $G_{j+2}$, which concludes the induction step. All the other cases are treated similarly. □

Figure 6: The Lts of block $p$.

We can now prove Proposition 5: Proof. It is easy to check that the construction of $N_A$, as described above, is polynomial. Then, assume $f$ is Parikh coverable in $A$, i.e. there is $(G, d) \in Reach(A)$ s.t. $f \preceq \mathrm{Parikh}(G)$. By Lemma 2, there is $m' \in Reach(N_A)$ s.t. $m' \triangleright (G, d)$. Hence, for all $s \in S$: $m'(s) = \mathrm{Parikh}(G)(s)$. So, for all $s \in S$: $m(s) = f(s) \le \mathrm{Parikh}(G)(s) = m'(s)$. Hence, $m \preceq m'$ (as $m(p) = 0$ for all $p \notin S$). Since $m' \in Reach(N_A)$, we conclude that $m \in Cover(N_A)$. On the other hand, assume $m \in Cover(N_A)$, with $m(p) = 0$ for all $p \notin S$, and let $f$ be s.t. for all $s \in S$: $f(s) = m(s)$. Since $m \in Cover(N_A)$, there is $m' \in Reach(N_A)$ s.t. $m \preceq m'$. By Lemma 2, there is $(G, d) \in Reach(A)$ s.t. $m' \triangleright (G, d)$. Thus, by definition of $\triangleright$, for all $s \in S$: $m'(s) = \mathrm{Parikh}(G)(s)$. Thus, since $m \preceq m'$ and by definition of $f$, we conclude that for all $s \in S$: $f(s) = m(s) \le m'(s) = \mathrm{Parikh}(G)(s)$. Hence, $f$ is Parikh-coverable in $A$. □
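
The two objects compared in this proof can be made concrete with a short sketch. It assumes that a configuration is summarized by the list of locations of its tasks; the function names `parikh`, `is_covered` and `marking_from` are illustrative choices, not notation from the paper.

```python
from collections import Counter
from typing import Dict, Iterable


def parikh(task_locations: Iterable[str]) -> Counter:
    """Parikh image: how many tasks currently sit in each location."""
    return Counter(task_locations)


def is_covered(f: Dict[str, int], image: Counter) -> bool:
    """f is covered by the image if image(s) >= f(s) for every location s."""
    return all(image[s] >= n for s, n in f.items())


def marking_from(f: Dict[str, int], places: Iterable[str]) -> Dict[str, int]:
    """Marking m with m(s) = f(s) on locations and m(p) = 0 on every other place."""
    return {p: f.get(p, 0) for p in places}


image = parikh(["s1", "s1", "s2"])
```

Here `is_covered({"s1": 2}, image)` holds, whereas asking for two tasks in `s2` does not, and `marking_from` pads the target with zeros on the places that do not correspond to locations.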

Proposition 6 For all Petri nets $N$, we can build, in polynomial time, a concurrent asynchronous Qdas $A_N$ s.t. $m \in Cover(N)$ iff there exists $(G, d) \in Reach(A_N)$ with $G \triangleright m$.

The proof of Proposition 6 is split into two lemmata, given hereunder. They rely on an alternate characterization of $Cover(N)$. That is, $m \in Cover(N)$ iff $m$ is reachable by a so-called lossy run of $N$, i.e. a sequence of markings $m'_0 m'_1 \cdots m'_n$ s.t. $m'_0 \preceq m_0$ and for all $0 \le i \le n-1$: there is $m_{i+1}$ and a transition $t_i$ s.t. $m'_i \xrightarrow{t_i} m_{i+1}$ and $m'_{i+1} \preceq m_{i+1}$. Intuitively, a lossy run corresponds to firing a transition of the Pn, and then spontaneously losing some tokens. The proof of these lemmata also assumes that, for each $p \in P$, the
Lts TSp = {{s%, s™*'', s^''"}, s°, S, ^) is as depicted in Fig. 6. 
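
The lossy-run characterization can be checked mechanically on small examples. The sketch below is only an illustration under simplifying assumptions: markings are dictionaries from place names to token counts, a transition is a pair of input and output dictionaries, and the names `fire`, `leq` and `is_lossy_run` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Marking = Dict[str, int]
Transition = Tuple[Marking, Marking]  # (input weights, output weights)


def fire(m: Marking, t: Transition) -> Optional[Marking]:
    """Fire t from m, or return None if some input place lacks tokens."""
    pre, post = t
    if any(m.get(p, 0) < n for p, n in pre.items()):
        return None
    out = dict(m)
    for p, n in pre.items():
        out[p] = out.get(p, 0) - n
    for p, n in post.items():
        out[p] = out.get(p, 0) + n
    return out


def leq(a: Marking, b: Marking) -> bool:
    """Pointwise order on markings (missing places hold zero tokens)."""
    return all(n <= b.get(p, 0) for p, n in a.items())


def is_lossy_run(m0: Marking, markings: List[Marking], transitions: List[Transition]) -> bool:
    """Check m'_0 <= m0 and, at every step, that m'_{i+1} is below the result of firing t_i."""
    if len(markings) != len(transitions) + 1 or not leq(markings[0], m0):
        return False
    for cur, nxt, t in zip(markings, markings[1:], transitions):
        fired = fire(cur, t)
        if fired is None or not leq(nxt, fired):
            return False
    return True


t = ({"p": 1}, {"q": 1})   # moves one token from p to q
m0 = {"p": 2}
```

With these values, the sequence `[{"p": 2}, {"q": 1}]` is a lossy run (the remaining token in `p` is lost after firing), while `[{"p": 2}, {"q": 2}]` is not, since tokens can be lost but never created.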

Lemma 3 Let $N = (P, T, m_0)$ be a Pn, and let $A_N = (CQID, \emptyset, \Gamma, \mathit{main}, X, \Sigma, (TS_\gamma)_{\gamma \in \Gamma})$ be its corresponding Qdas. If $m \in Cover(N)$ then there exists $(G, d) \in Reach(A_N)$ s.t. $G \triangleright m$.

Proof. Let $m$ be a marking from $Cover(N)$, and let $m'_0 m'_1 \cdots m'_n$ be a lossy Pn run s.t. $m = m'_n$. The proof is by induction on the length of the run. More precisely, we show that, for all $0 \le i \le n$, there is a reachable configuration $(G_i, d_i) \in Reach(A_N)$ s.t.: for all $p \in P$: $d_i(v_p) = 0$, $G_i = (V^i, E^i, \lambda^i, \mathit{queue}^i, \mathit{state}^i)$, $G_i \triangleright m'_i$, $m'_i \preceq m_i$ and $E^i = \emptyset$.

Base case: $m'_0$. Let us consider the run of $A_N$ that consists in: (a) executing block main up to line 8, then (b) emptying the queue $C$. The execution of (a) has the effect that: (i) all $v_p$ variables are initialized to 0 and keep this value, (ii) for all places $p$: at most $m_0(p)$ copies of block $p$ are asynchronously dispatched in queue $C$ and (iii) one copy of block trans is dispatched in $C$. Then, the execution of (b) creates one running task for each block that is present in $C$. Thus, the execution of (a) followed by (b) reaches a configuration $(G_0, d_0)$ with $G_0 = (V^0, E^0, \lambda^0, \mathit{queue}^0, \mathit{state}^0)$ s.t. $V^0$ contains no queue vertices (the queue has been emptied), for all $p$: $|\{v \in V^0 \mid \lambda(v) = p\}| = m'_0(p)$, $|\{v \in V^0 \mid \lambda(v) = \mathit{trans}\}| = 1$ and $E^0 = \emptyset$ (the queue is empty and all the calls are asynchronous). Moreover, $\mathit{state}^0$ is such that each task running a $p$ block is still in its initial state $s^0_p$, hence $G_0 \triangleright m'_0$. Similarly, the task running the trans block is about to enter the while loop at line 14. Finally, as the variables have been initialized to 0 and not modified, we have $d_0(v_p) = 0$ for all $p \in P$.

Inductive case: $m'_i$. Let us assume there exists $(G_{i-1}, d_{i-1}) \in Reach(A_N)$ that respects all the conditions given at the beginning of the proof (in particular $G_{i-1} \triangleright m'_{i-1}$). Let $t_i$ and $m_i$ be the Pn transition and marking s.t. $m'_{i-1} \xrightarrow{t_i} m_i$ and $m'_i \preceq m_i$, and let us show that $A_N$ can simulate it. This is achieved by the following sequence of actions in $A_N$. First, the block executing trans enters the while loop at line 14 and selects $t_i$ as transition $t$. Then, it sets all the variables $v_p$ s.t. $I_{t_i}(p) = 1$ to 1. Thus, at that point $v_p$ contains 1 iff $I_{t_i}(p) = 1$, since all $v_p$ variables were equal to 0 by induction hypothesis. Then, the task executing trans is blocked, as it needs to wait until all $v_p$ are equal to 0. Since $G_{i-1} \triangleright m'_{i-1}$ by induction hypothesis, we know that there are, in $G_{i-1}$, $m'_{i-1}(p)$ tasks executing block $p$, for all $p \in P$. However, $t_i$ is fireable from $m'_{i-1}$, and a loss of $m_i - m'_i$ tokens is still possible after the firing. Hence, $m'_{i-1}(p) \ge I_{t_i}(p) + m_i(p) - m'_i(p)$ for all $p$. Thus, for all $p$, there are at least $I_{t_i}(p) + m_i(p) - m'_i(p)$ tasks executing $p$ in $G_{i-1}$. Thus, we complete the run of $A_N$ by letting, for all $p$, $I_{t_i}(p) + m_i(p) - m'_i(p)$ of the $p$ tasks execute line 11, in turn, one after the other. Then, we let them all execute line 12 and reach their final state (remark that all these $p$ tasks must first execute line 11 before one of them can execute line 12, as this sets $v_p$ to 0 and would prevent other tasks from executing line 11). This is possible because none of those tasks are blocked, since the Ctg contains no edge, by induction hypothesis. At that point, $A_N$ has reached a configuration $(G', d')$ s.t. $d'(v_p) = 0$ for all $p \in P$ (by line 12) and where $G' \triangleright m'_{i-1} - (I_{t_i} + m_i - m'_i)$. Moreover, $G'$ still respects all the other hypotheses, as no new dispatch has been performed. Then, the simulation of $t_i$ proceeds by letting the trans task finish the current iteration of the main while loop. This consists in executing the for loop of line 19, which dispatches one $p$ block in $C$ iff $O_{t_i}(p) = 1$, i.e., iff the effect of $t_i$ is to add a token to $p$. Finally, the scheduler empties queue $C$ and creates tasks for all the blocks that have just been added to $C$. It also kills all the $p$ tasks that have reached their final state. As a consequence, the configuration that is reached is $(G_i, d_i)$, where

$G_i \triangleright m'_{i-1} - (I_{t_i} + m_i - m'_i) + O_{t_i} = (m'_{i-1} - I_{t_i} + O_{t_i}) - m_i + m'_i = m_i - m_i + m'_i = m'_i$

and $d_i$ is s.t. $d_i(v_p) = 0$ for all $p \in P$. Moreover, since the queue has been emptied by the scheduler, $G_i$ contains only task nodes and no edge, as all the calls are asynchronous. The task executing trans is still active and at line 14, and all the $p$ tasks are in their initial state. □

Lemma 4 Let $N = (P, T, m_0)$ be a Pn, and let $A_N = (CQID, \emptyset, \Gamma, \mathit{main}, X, \Sigma, (TS_\gamma)_{\gamma \in \Gamma})$ be its corresponding Qdas. If there are $(G, d) \in Reach(A_N)$ and $m$ s.t. $G \triangleright m$, then $m \in Cover(N)$.

Proof. For a Ctg $G$ of $A_N$ with set of vertices $V$, we denote by $M(G)$ the marking of $N$ s.t. for all $p \in P$: $M(G)(p) = |\{v \in V \mid \mathit{state}(v) = s^0_p\}|$. Thus, in the case where $G$ encodes a configuration s.t. trans is at line 14, main is at line 8, and all the $p$ blocks are in their initial state, then $G \triangleright M(G)$.

In order to establish the lemma, we prove a stronger statement: every time we reach, along a run, a configuration $(G, d)$ s.t. trans is at line 14, then $M(G) \in Cover(N)$. Formally, let $\rho = (G_0, d_0) a_0 (G_1, d_1) a_1 (G_2, d_2) \cdots (G_n, d_n)$ be a run of $A_N$, where, for all $i \le n$: $G_i = (V_i, E_i, \lambda_i, \mathit{queue}_i, \mathit{state}_i)$. Let $\pi : \{0, \ldots, k\} \to \{0, \ldots, n\}$ be the monotonically increasing function s.t. $k \le n$ and for all $0 \le j \le n$: there exists $v \in V_j$ with $\mathit{state}_j(v) = s^{14}_{\mathit{trans}}$ iff there is $0 \le \ell \le k$ with $j = \pi(\ell)$. That is, the sequence $\pi(0), \pi(1), \ldots, \pi(k)$ identifies the indexes of all the configurations of the run where trans is at line 14. Let us show, by induction on $i$, that all the $M(G_{\pi(i)})$'s are reachable in the lossy semantics of $N$.

Base case $i = 0$. Let us show that $M(G_{\pi(0)}) = m_0$, i.e., that the first time trans reaches line 14, $M(G_{\pi(0)})$ is the initial marking of $N$. Observe that the prefix of the run must have the following form. Initially, only the main block is executing: it first sets all the variables $v_p$ to 0, then dispatches asynchronously at most $m_0(p)$ calls to each $p$ block (for all $p \in P$), then finally dispatches an asynchronous call to trans and reaches line 8. Along this execution, the scheduler might decide to pick up some $p$ blocks from $C$. However, as long as the scheduler has not scheduled the call to trans, the Ctg met along the run do not encode any marking, by definition of $\triangleright$. When the scheduler starts a task to run the trans block, we thus reach a configuration $(G, d)$ where: (i) the queue $C$ is empty, as dequeueing the trans block is possible only if all the $p$ blocks have been dequeued, and no other dispatch has been performed; (ii) all the $p$ tasks are blocked in their initial state as $d(v_p) = 0$ for all $p \in P$; and (iii) main is still blocked in the infinite loop at line 8. Since the scheduler has just dequeued trans from $C$, $G$ is necessarily the first Ctg to encode a marking, so $G = G_{\pi(0)}$. Moreover, by the loop at line 4, it is clear that $G \triangleright m$ with $m \preceq m_0$.

Inductive case $i = \ell \ge 1$. The induction hypothesis is that $M(G_{\pi(\ell-1)}) \in Cover(N)$. Let us consider $\rho' = (G_{\pi(\ell-1)}, d_{\pi(\ell-1)}) \cdots (G_{\pi(\ell)}, d_{\pi(\ell)})$, i.e. the portion of $\rho$ that leads from $(G_{\pi(\ell-1)}, d_{\pi(\ell-1)})$ to $(G_{\pi(\ell)}, d_{\pi(\ell)})$. We consider two cases:

1. Either trans has not performed an iteration of its main while loop along $\rho'$. In this case, the only actions that can occur along $\rho'$ are scheduler actions consisting in dequeueing $p$ blocks or the termination of some $p$ tasks that had already left their initial state. In both cases, this does not modify the value of $M(G)$, so $M(G_{\pi(\ell)}) = M(G_{\pi(\ell-1)}) \in Cover(N)$.

2. Or trans has performed a complete iteration of its main while loop, possibly interleaved with the dequeue of $p$ blocks and the termination of $p$ tasks. Since the dequeues and terminations have no influence on the value of $M(G)$, as argued above, let us focus on the effect of executing one iteration of the while loop. The iteration first selects a Pn transition $t$ and sets all the variables $v_p$ s.t. $I_t(p) = 1$ to 1. The reached configuration is then $(G, d)$ where $M(G) = M(G_{\pi(\ell-1)})$, as these operations do not manipulate $p$ blocks or tasks. Then, trans is blocked by the test at line 18. As only $p$ blocks can set $v_p$ variables to 0, we are sure that, when trans reaches line 19, at least $I_t(p)$ $p$ blocks have left their initial state, for all $p \in P$. Thus, when trans is at line 19, the configuration is $(G', d')$, where for all $p \in P$: $M(G')(p) \le M(G)(p) - I_t(p) = M(G_{\pi(\ell-1)})(p) - I_t(p)$. Afterwards, trans terminates the iteration of the while loop by dispatching $O_t(p)$ $p$ blocks for all $p \in P$, and reaches line 14, which finishes $\rho'$. Hence, we reach $(G_{\pi(\ell)}, d_{\pi(\ell)})$, where for all $p \in P$: $M(G_{\pi(\ell)})(p) \le M(G_{\pi(\ell-1)})(p) - I_t(p) + O_t(p)$. Since $M(G_{\pi(\ell-1)}) \in Cover(N)$ by induction hypothesis, we conclude that $M(G_{\pi(\ell)}) \in Cover(N)$ too. □

B.2 Asynchronous Serial Qdas 

We establish the undecidability for asynchronous serial Qdas by a reduction from the control-state reachability problem in a fifo system. Let $F = (S_F, s^0_F, M, \Delta_F)$ be a fifo system and let $c \in S_F$ be a control state whose reachability has to be tested. We build the asynchronous serial Qdas $A_F = (\emptyset, \{q\}, \Gamma, \mathit{main}, X, \Sigma, (TS_\gamma)_{\gamma \in \Gamma})$ on domain $D = M \cup S_F \cup \{\varepsilon\}$, where $\Gamma = M \cup \{\varepsilon, \mathit{main}\}$, $X = \{\mathit{state}, \mathit{head}\}$ and the $TS_\gamma$ are given by the pseudo code in Fig. 7.

Intuitively, runs of $A_F$ simulate the runs of $F$ by encoding the current state of $F$ in variable state and the content of $F$'s queue into the content of the serial queue $q$. More precisely, it is easy to check that, once main has reached line 8, all the Ctg that are reached in $A_F$ are of either shape depicted in Fig. 7, for $\{m_1, \ldots, m_n, m\} \subseteq M \cup \{\varepsilon\}$. That is, there are at most two running tasks: main and possibly one task running an $m$ block (for $m \in M \cup \{\varepsilon\}$), which has to terminate to allow a further dequeue from $q$. This is because $q$ is a serial queue and all the dispatches are asynchronous. When the Ctg is of shape (b), the duty of the running $m$ block is to simulate a run of $F$. It runs an infinite while loop (line 11 onwards; ignore the test at line 10 for the moment) that (i) tests whether $c$ has been reached (line 12) and jumps to line 21 if it is the case; (ii) guesses a transition $(s, a, s')$ of $F$; and (iii) checks that the guessed transition is indeed fireable from the current configuration of $F$, and, if yes, simulates it. This consists in first testing that $s$ is the current state (line 14). If not, the block jumps to the infinite loop of line 20, which ends the simulation. Otherwise, the current state is updated to $s'$, and the channel operation is then simulated. A send of message $m$ is simulated (line 16) by an asynchronous dispatch of block $m$ to $q$. The simulation of a receive of $m$ from $q$ is more involved, as only the scheduler can decide to dequeue a block from $q$, and this can happen only if the current running block terminates (line 19). Still, we have to check that message $m$ is indeed at the head of $q$. This is achieved by setting global variable head to $m$, and letting the next dequeued block check that it encodes the value stored in head. This is performed at line 10. If this check fails, the block jumps to the infinite loop of line 20, and the simulation ends. Otherwise, it proceeds with the simulation. Thus, in all reachable configurations of $A_F$, a block $m$ (with $m \in M \cup \{\varepsilon\}$) will reach line 21 iff $c$ is reachable in $F$. This effectively reduces the control location reachability of fifo systems to the Parikh coverability problem of serial asynchronous Qdas.

1  global state, head
2  global s_queue q

4  def main():
5    state := s^0_F
6    head := ε
7    dispatch_a(q, ε)
8    while (true): do nothing

9  def m():  // for all m ∈ M ∪ {ε}
10   if (head ≠ m): goto 20
11   while (true):
12     if (state = c): goto 21
13     select (s, a, s') ∈ Δ_F
14     if (s ≠ state): goto 20
15     state := s'
16     if (a = !n): dispatch_a(q, n)
17     else if (a = ?n):
18       head := n
19       terminate
20   while (true): do nothing  // wrong guess
21   while (true): do nothing  // c is reached

Note that the reachability of a state c of the fifo system is explicitly coded into the control structure.

Ctg type (a): [diagram not recoverable]

Ctg type (b): [diagram not recoverable]

Figure 7: Fifo system encoding into a serial asynchronous Qdas; two types of Ctg in this case
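
The interplay between the serial queue and the head variable can be replayed with a small sketch. It is a simplification under stated assumptions: the non-deterministic guesses are supplied as an explicit list, actions are encoded as hypothetical pairs such as `("send", "a")`, `("recv", "a")` or `("eps", None)`, and the function name `replay` and its return strings are illustrative only. Line numbers in the comments refer to Fig. 7.

```python
from collections import deque
from typing import List, Optional, Tuple

Guess = Tuple[str, Tuple[str, Optional[str]], str]  # (s, (kind, message), s')


def replay(initial: str, target: str, guesses: List[Guess]) -> str:
    """Replay guessed transitions as the running m block would.

    Returns "reached" (line 21), "wrong guess" (line 20), "stuck" (nothing
    left to dequeue after a terminate) or "running".
    """
    state = initial
    queue = deque()                      # pending blocks of the serial queue q
    for s, (kind, msg), s_next in guesses:
        if state == target:              # line 12
            return "reached"
        if s != state:                   # line 14
            return "wrong guess"
        state = s_next                   # line 15
        if kind == "send":
            queue.append(msg)            # line 16: asynchronous dispatch of block msg
        elif kind == "recv":
            head = msg                   # line 18
            if not queue:                # line 19: terminate, but no block to dequeue
                return "stuck"
            block = queue.popleft()      # the scheduler dequeues the next block
            if head != block:            # line 10, executed by the new block
                return "wrong guess"
    return "reached" if state == target else "running"


ok = replay("s0", "s2", [("s0", ("send", "a"), "s1"), ("s1", ("recv", "a"), "s2")])
bad = replay("s0", "s2", [("s0", ("send", "a"), "s1"), ("s1", ("recv", "b"), "s2")])
```

In the first call the received message matches the block at the head of the queue, so the target is reached; in the second call the dequeued block does not match head and the simulation ends in the wrong-guess loop.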

The proof of Theorem 4 relies on the next Lemma, which formalizes the relationship between reachable configurations of $A_F$ and reachable configurations of $F$.

For all $\gamma \in \Gamma$, we denote by $s^\ell_\gamma$ the location of $TS_\gamma$ that corresponds to line $\ell$ in Fig. 7. Then, we say that a configuration $(G, d)$ of $A_F$ encodes a configuration $(s, w)$ of $F$, written $(G, d) \triangleright (s, w)$, iff: (i) $s = d(\mathit{state})$, (ii) $G$ is of either shape in Fig. 7 with $w = m_0 m_1 \cdots m_n$, (iii) $\mathrm{Parikh}(G)(s^8_{\mathit{main}}) = 1$ and (iv) there exists $m \in M \cup \{\varepsilon\}$ s.t. $\mathrm{Parikh}(G)(s^{12}_m) = 1$. That is, $s$ and $w$ are encoded as described above, main is at line 8, and the running $m$ block is at line 12. Then:

Lemma 5 Let $F$ be a FIFO system, let $c$ be a control state of $F$, and let $A_F$ be its associated Qdas. For all runs $(s_0, w_0)(s_1, w_1) \cdots (s_n, w_n)$ of $F$ s.t. for all $0 \le i \le n$: $s_i \ne c$, there exists $(G, d) \in Reach(A_F)$ s.t. $(G, d) \triangleright (s_n, w_n)$. Moreover, for all $(G, d) \in Reach(A_F)$ and for all configurations $(s, w)$ of $F$: $(G, d) \triangleright (s, w)$ implies $(s, w) \in Reach(F)$.

Proof. First, we consider a run $(s_0, w_0)(s_1, w_1) \cdots (s_n, w_n)$ of $F$ s.t. for all $0 \le i \le n$: $s_i \ne c$, and build a run $(G_0, d_0) a_0 (G_1, d_1) a_1 \cdots a_{k-1} (G_k, d_k)$ of $A_F$ s.t. $(G_k, d_k) \triangleright (s_n, w_n)$, by induction on the length of $F$'s run.

Base case $n = 0$: Consider the run of $A_F$ that consists in executing lines 5, 6, 7 of main (which sets the head variable to $\varepsilon$), then dequeueing the $\varepsilon$ block from the queue, then executing lines 10 and 11 of $\varepsilon$. Remark that the test at line 10 is not satisfied, as head $= \varepsilon$, and that the queue is now empty. Clearly, the resulting configuration $(G, d)$ is s.t. $(G, d) \triangleright (s^0_F, w_0)$ as $w_0 = \varepsilon$.

Inductive case $n = \ell$. Let us assume that there is a reachable configuration $(G, d)$ of $A_F$ s.t. $(G, d) \triangleright (s_{\ell-1}, w_{\ell-1})$, and let us build a sequence of $A_F$ transitions that is fireable from $(G, d)$ and reaches a configuration encoding $(s_\ell, w_\ell)$. In $(G, d)$, there is, by definition of $\triangleright$, a task running a $b$ block, for $b \in M \cup \{\varepsilon\}$, that is at line 12. Moreover, $d(\mathit{state}) = s_{\ell-1}$. Let $\delta$ be the transition of $F$ s.t. $(s_{\ell-1}, w_{\ell-1}) \xrightarrow{\delta} (s_\ell, w_\ell)$. By hypothesis, $s_{\ell-1} \ne c$, hence, we let $b$ execute line 12; select $\delta = (s_{\ell-1}, a, s_\ell)$ at line 13; execute line 14, where the condition of the if is not satisfied as $s = s_{\ell-1} = \mathit{state}$; and execute line 15, which reaches a configuration $(G', d')$ where $d'(\mathit{state}) = s_\ell$. We consider three cases to complete the simulation of $\delta$ in $A_F$. If $a = {!n}$, the $b$ task performs an asynchronous dispatch of $n$ to $q$, and jumps to line 11, then 12. Clearly, the resulting configuration $(G'', d'')$ is s.t. $(G'', d'') \triangleright (s_\ell, w_\ell)$ (in particular, the dispatch has correctly updated the content of the queue). If $a = \varepsilon$, the $b$ task jumps directly to line 11, then to line 12. Again, the resulting configuration $(G'', d'')$ is s.t. $(G'', d'') \triangleright (s_\ell, w_\ell)$, as the content of the queue has not been modified. Finally, if $a = {?n}$, the running $b$ block sets head to $n$ and terminates. Let $(G'', d'')$ be the $A_F$ configuration reached at that point. As $\delta$ is fireable from $(s_{\ell-1}, w_{\ell-1})$ in $F$, since $(G, d) \triangleright (s_{\ell-1}, w_{\ell-1})$, and as the content of the queue has not been modified since then, the head of $q$ is necessarily an $n$ block in $G''$. Moreover, $d''(\mathit{head}) = n$ and $d''(\mathit{state}) = s_\ell$. Thus, we let the scheduler dequeue this $n$ block, and we let the task running it execute line 10 (where the condition of the if is not satisfied), then line 11. Clearly, the resulting configuration encodes $(s_\ell, w_\ell)$.

Now, let $\rho = (G_0, d_0) a_0 (G_1, d_1) a_1 \cdots a_{n-1} (G_n, d_n)$ be a run of $A_F$ s.t. there is $(s, w)$ with $(G_n, d_n) \triangleright (s, w)$, and let us build, by induction on the length of this run, a run $(s_0, w_0)(s_1, w_1) \cdots (s_k, w_k)$ of $F$ s.t. $(s_k, w_k) = (s, w)$.

Let $K = |\{(G_i, d_i) \mid \mathrm{Parikh}(G_i)(s^{12}_m) = 1 \text{ for } m \in M \cup \{\varepsilon\}\}|$, i.e., $K$ is the number of times an $m$ block reaches line 12 along $\rho$. Let us consider the increasing monotonic function $p : \{1, \ldots, K\} \to \{0, \ldots, n\}$ s.t. for all $0 \le i \le n$: there exists $m \in M \cup \{\varepsilon\}$ s.t. $\mathrm{Parikh}(G_i)(s^{12}_m) = 1$ iff there is $1 \le j \le K$ s.t. $i = p(j)$, that is, $p(j)$ is the index, in $\rho$, of the $j$th time a configuration is reached where an $m$ block is at line 12. Clearly, by definition of $\triangleright$, only the $(G_{p(j)}, d_{p(j)})$ configurations (for $1 \le j \le K$) can encode a configuration of $F$, as no $m$ block is at line 12 in the other configurations of $\rho$. So, it is sufficient to show that all those $(G_{p(j)}, d_{p(j)})$ configurations encode a reachable configuration of $F$. We proceed by induction on $j$, and show that: for all $1 \le j \le K$: $(G_{p(j)}, d_{p(j)})$ encodes a reachable configuration of $F$ and $G_{p(j)}$ contains exactly one $m$ task (for $m \in M \cup \{\varepsilon\}$), that has been dequeued from $q$.

Base case $j = 1$: Observe that the subrun $(G_0, d_0) a_0 \cdots a_{p(1)-1} (G_{p(1)}, d_{p(1)})$ is necessarily an initialization phase where main sets state to $s^0_F$, head to $\varepsilon$, dispatches an $\varepsilon$ block, and reaches line 8, where it will stay forever. Then, the scheduler dequeues the $\varepsilon$ block, which empties the queue. The $\varepsilon$ task then traverses line 10 (as head $= \varepsilon$) and 11 and reaches line 12. So, clearly $(G_{p(1)}, d_{p(1)}) \triangleright (s^0_F, \varepsilon)$ and $G_{p(1)}$ contains exactly one $m$ task (for $m \in M \cup \{\varepsilon\}$), that has been dequeued from $q$.

Inductive case $j = \ell$. Let us assume that $(G_{p(\ell-1)}, d_{p(\ell-1)})$ encodes a reachable configuration $(s_{\ell-1}, w_{\ell-1})$ of $F$. We consider several cases. If $(G_{p(\ell-1)}, d_{p(\ell-1)}) = (G_{p(\ell)}, d_{p(\ell)})$ we are done. Otherwise, we have necessarily performed one iteration (possibly interrupted at line 12, 14 or 19) of the while loop at line 11 between $(G_{p(\ell-1)}, d_{p(\ell-1)})$ and $(G_{p(\ell)}, d_{p(\ell)})$, as, by induction hypothesis, $G_{p(\ell-1)}$ contains exactly one $m$ task (with $m \in M \cup \{\varepsilon\}$) that blocks $q$, and main can only loop at line 8, which does not modify the current configuration. Then, observe that the conditions of the if at lines 12 and 14 were necessarily false during the iteration. Otherwise, $m$ would have reached line 20 or line 21, from which it cannot escape. From that point, no configuration is reachable where an $m$ block is at line 12, and $(G_{p(\ell)}, d_{p(\ell)})$ cannot exist. Thus, we consider three cases:

• If we have entered the if at line 16 during the iteration, then a transition of the form $(s, {!n}, s')$ has been guessed, with state $= s$, and a dispatch of $n$ has been performed into $q$. As $(G_{p(\ell-1)}, d_{p(\ell-1)}) \triangleright (s_{\ell-1}, w_{\ell-1})$ by induction hypothesis, $s_{\ell-1} = s$, and thus $(s, {!n}, s')$ is fireable from $(s_{\ell-1}, w_{\ell-1})$ and reaches $(s', n \cdot w_{\ell-1})$. Clearly, this configuration is encoded by $(G_{p(\ell)}, d_{p(\ell)})$.

• If we have entered the else if at line 17 during the iteration, then a transition of the form $(s, {?n}, s')$ has been guessed, with state $= s$, head has been set to $n$, the current $m$ block has been terminated, and a new block $m'$ has been dequeued by the scheduler (as there is necessarily a running $m$ block in $G_{p(\ell)}$). Moreover $m' = n$, because $m'$ has to be at line 12 in $G_{p(\ell)}$, so the test of line 10 had to be false to allow $m'$ to reach line 12. As $(G_{p(\ell-1)}, d_{p(\ell-1)}) \triangleright (s_{\ell-1}, w_{\ell-1})$ by induction hypothesis, $s_{\ell-1} = s$. As a dequeue of a block $m' = n$ has been performed, $w_{\ell-1}$ is of the form $w \cdot n$. Thus, $(s, {?n}, s')$ is fireable from $(s_{\ell-1}, w_{\ell-1})$ and reaches $(s', w)$. Clearly, this configuration is encoded by $(G_{p(\ell)}, d_{p(\ell)})$.

• Finally, if neither the if nor the else if have been entered during the iteration, then a transition of the form $(s, \varepsilon, s')$ has been guessed, with state $= s$. As $(G_{p(\ell-1)}, d_{p(\ell-1)}) \triangleright (s_{\ell-1}, w_{\ell-1})$ by induction hypothesis, $s_{\ell-1} = s$, and thus $(s, \varepsilon, s')$ is fireable from $(s_{\ell-1}, w_{\ell-1})$ and reaches $(s', w_{\ell-1})$. Clearly, this configuration is encoded by $(G_{p(\ell)}, d_{p(\ell)})$. □
We can now prove Theorem 4: Proof. Let $F$ be a FIFO system, with set of messages $M$ and associated serial asynchronous Qdas $A_F$, and let $c$ be a control location of $F$. For all $m \in M \cup \{\varepsilon\}$, let $f_m$ be the Parikh image s.t. $f_m(s^{21}_m) = 1$ and $f_m(s) = 0$ for all $s \ne s^{21}_m$. Remark that there are only finitely many such $f_m$. Then, we show that $c$ is reachable in $F$ iff there exists $m \in M \cup \{\varepsilon\}$ s.t. $f_m$ is Parikh-coverable in $A_F$.

Assume c is reachable in F, and let (c, w;) be a configuration in Reach(F). 
Without loss of generality, assume c is reachable by run that visits c only once. 
By Lemma 5, there is {G,d) G Reach{AF) s.t. {G,d) > {c,w). Hence, in {G,d), 
there is a task running an m block (for to G M U {e}) that is at line 12, and 
d(state) = c. Thus, m can execute one step and reach line 21, so is Parikh 
coverable in Af- 

For the reverse direction, assume there is to G M U {e} that is Paxikh- 
coverable in ^ir. Hence, there is {G,d) € Reach{AF) where a task running 
block to is at line 21. The only way for that block to reach line 21 is from 
line 12, with a valuation d' s.t. d'(state) = c. Thus, there is, in ReachlAp) a 
configuration {G',d') with d'(state) = c, a task running an to block at line 12, 
and necessarily main at line 8 (otherwise, only main would be running). Hence, 
{G',d') is a reachable configuration of Af s.t. {G',d') > {c,w) for some queue 
content w. Thus, by Lemma 5, (c, u>) G Reach{F), and c is reachable in F. 

We have thus reduced the control location reachability problem of FIFO 
systems to the Parikh coverability problem of serial asynchronous Qdas (using 
only one serial queue). The former is undecidable. Hence the theorem. □ 
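
To illustrate the property that the reduction targets, the following Python sketch checks whether a target function is covered by a configuration. It is only a sketch under a simplifying assumption that is not spelled out in this appendix: a configuration is summarised by the multiset of locations of its tasks. The names `parikh_image` and `covers` and the location labels such as `"m@21"` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable


def parikh_image(task_locations: Iterable[str]) -> Counter:
    """Count how many tasks sit at each location (assumed summary of a Ctg)."""
    return Counter(task_locations)


def covers(task_locations: Iterable[str], f: Dict[str, int]) -> bool:
    """True iff, for every location, at least f[location] tasks are there."""
    image = parikh_image(task_locations)
    return all(image[loc] >= need for loc, need in f.items())


# Hypothetical labels: main at line 8, one task of block m at line 21,
# and one task of another block at line 12.
config = ["main@8", "m@21", "m2@12"]
f_m = {"m@21": 1}  # one task of block m at line 21, no other requirement
print(covers(config, f_m))
```

Under this reading, asking whether some reachable configuration covers `f_m` amounts to asking whether a task running block `m` can ever reach line 21, which is how the proof above uses the property.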

B.3 Concurrent Qdas 

We reduce the reachability problem of two counter systems (2Cs). Let us give
the intuition of the construction. For each 2Cs $\mathcal{P}$, we construct a
Qdas $A_{\mathcal{P}}$ s.t. all reachable Ctg in $A_{\mathcal{P}}$ encode
configurations of $\mathcal{P}$ and are of the form depicted in Fig. 8. That
is (after an initialization phase), there are always three tasks that are
unblocked: a main task to simulate $\mathcal{P}$'s control structure, and,
for each $i \in \{1,2\}$, either a task eins(i) or a task null(i). If the task
null(i) is unblocked, then counter $i$ is zero in the current configuration of
$\mathcal{P}$. Otherwise, the current valuation of counter $i$ is encoded by
the number of eins(i) tasks in the Ctg. Remark that, as in the case of
synchronous Qdas, the parts of the Ctg that encode each counter behave as
pushdown stacks. Finally, the control location of $\mathcal{P}$ is recorded in
the global variable state.

The actual operations on the counters will be simulated by the eins(i) and
null(i) running tasks. As main simulates the control structure, we need to
synchronize main with those eins(i) and null(i) tasks. Let us explain
intuitively how we can achieve rendezvous synchronization between running
tasks using global variables of Qdas. Consider a Qdas with three global
variables: $\ell_1$, $\ell_2$ ranging over Boolean, and $x$ ranging over a
finite set of 'messages' $M$. Let $\gamma_1$ and $\gamma_2$ be two blocks
whose Lts are:

[Diagram: Lts of $\gamma_1$ and $\gamma_2$ (for $m \in M$); transition labels not recoverable]
     1  global state
     2  global ℓ^1_1, ℓ^1_2, x^1   // rdvz channel 1
     3  global ℓ^2_1, ℓ^2_2, x^2   // rdvz channel 2
     4  global c_queue q

     6  def main():
     7    foreach i in {1,2}:
     8      dispatch_a(q, null(i))
     9      i?ack
    10    state := x_0

    12    while (true):
    13      select (s,a,s') ∈ Δ_P where state = s

    15      if a = incr(1):
    16        1!incr
    17        1?ack
    18        state := s'

    20      // other actions analogous
    21      ...

Figure 8: From a two counter system to a Qdas: main and null(i)/eins(i) for
i = 1, 2

Assume a configuration $c$ of the Qdas where $\ell_1 = \ell_2 = 0$ and where
two distinct tasks running $\gamma_1$ and $\gamma_2$ are unblocked and are in
$s_0$ and $s'_0$ respectively. Assume that no other task can access $\ell_1$,
$\ell_2$ and $x$. It is easy to check that, from $c$, there is only one
possible interleaving of the transitions of $\gamma_1$ and $\gamma_2$. So if
$\gamma_2$ reaches $s'_5$ from $c$, then $\gamma_1$ must have reached $s_5$,
and the $x = m$ test in $\gamma_1$ has been fired after the $x := m$
assignment in $\gamma_2$. This achieves rendezvous synchronisation between
$\gamma_1$ and $\gamma_2$, with the passing of message $m$. This can easily be
extended to rendezvous via different "channels", by adding extra global
variables. So, we extend the syntax of Qdas by allowing transitions of the
form $(s_0, c!m, s_5)$ and $(s'_0, c?m, s'_5)$ (for $m \in M$) to denote
respectively a send and a receive of message $m$ on a rendezvous channel $c$.
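
The claim that two guarded blocks admit a single interleaving can be checked mechanically on a small example. The Python sketch below enumerates all maximal interleavings of a sender and a receiver that communicate through two Boolean globals and one message variable. The concrete transitions are an assumption made for illustration: they are one possible handshake of this kind, not necessarily the Lts of $\gamma_1$ and $\gamma_2$ from the text. All names (`make_blocks`, `maximal_runs`, `l1`, `l2`, `x`) are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Globals = Dict[str, object]
# A transition is (label, guard, update); a task whose guard fails is stuck.
Transition = Tuple[str, Callable[[Globals], bool], Callable[[Globals], None]]


def make_blocks(m: str) -> Tuple[List[Transition], List[Transition]]:
    """Assumed handshake: the sender writes x, the receiver tests x."""
    sender = [
        ("x:=m", lambda g: True, lambda g: g.update(x=m)),
        ("l1:=1", lambda g: True, lambda g: g.update(l1=1)),
        ("l2==1", lambda g: g["l2"] == 1, lambda g: None),
        ("l2:=0", lambda g: True, lambda g: g.update(l2=0)),
    ]
    receiver = [
        ("l1==1", lambda g: g["l1"] == 1, lambda g: None),
        ("x==m", lambda g: g["x"] == m, lambda g: None),
        ("l1:=0", lambda g: True, lambda g: g.update(l1=0)),
        ("l2:=1", lambda g: True, lambda g: g.update(l2=1)),
    ]
    return sender, receiver


def maximal_runs(sender: List[Transition],
                 receiver: List[Transition]) -> List[List[Tuple[str, str]]]:
    """Enumerate every maximal interleaving starting from l1 = l2 = 0."""
    runs: List[List[Tuple[str, str]]] = []

    def explore(g: Globals, pcs: List[int], trace: List[Tuple[str, str]]) -> None:
        progressed = False
        for idx, (name, block) in enumerate((("snd", sender), ("rcv", receiver))):
            pc = pcs[idx]
            if pc < len(block):
                label, guard, update = block[pc]
                if guard(g):
                    g2 = dict(g)          # copy, so branches stay independent
                    update(g2)
                    next_pcs = list(pcs)
                    next_pcs[idx] += 1
                    explore(g2, next_pcs, trace + [(name, label)])
                    progressed = True
        if not progressed:                # nobody can move: the run is maximal
            runs.append(trace)

    explore({"l1": 0, "l2": 0, "x": None}, [0, 0], [])
    return runs


runs = maximal_runs(*make_blocks("incr"))
print(len(runs))
for step in runs[0]:
    print(step)
```

For this assumed handshake the search finds exactly one maximal run, in which the receiver's test on `x` occurs after the sender's assignment and the sender can only finish once the receiver has finished, mirroring the argument above.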

We rely on this mechanism to let main send operations to be performed on the
counters to the null(i) and eins(i) running tasks. More precisely, for a 2Cs
$\mathcal{P} = (X, x_0, \Sigma_P, \Delta_P)$, we build the Qdas
$A_{\mathcal{P}} = (CQID, \emptyset, \Gamma, \mathit{main}, \Sigma, (TS_\gamma)_{\gamma \in \Gamma})$
where $CQID = \{q\}$,
$\Gamma = (\{\mathit{null}, \mathit{eins}\} \times \{1,2\}) \cup \{\mathit{main}\}$,
$\mathcal{X} = \{\ell^1_1, \ell^1_2, x^1, \ell^2_1, \ell^2_2, x^2\}$ where
$x^1, x^2$ range over the domain {incr, decr, is_zero, ack}, and the
transition systems are given in Fig. 8. The variables $\mathcal{X}$ encode two
channels that we call 1 and 2 in the pseudo code of Fig. 8. The main task runs
an infinite while loop (line 12 onwards) that consists in guessing a
transition $(s, a, s')$ of $\mathcal{P}$ and synchronising, via rendezvous on
the channels 1 and 2, with the relevant null or eins unblocked task, to let it
execute the operation on the counter. When a null(i) or eins(i) receives an
incr message, it performs a synchronous dispatch of eins(i) into q to
increment counter i, and the operation is acknowledged to main, thanks to
message ack. When an eins block receives a decr message, it terminates, which
decrements the counter. null blocks cannot receive decr messages, so, if main
requests a decr operation when the counter is zero, main gets blocked. This
means that the guessed transition was not fireable in the currently simulated
2Cs configuration, and ends the simulation. Finally, only null blocks can
receive and acknowledge is_zero messages, so, again, main is blocked after
sending is_zero to a non-zero counter. Note that we need both asynchronous
calls to start two counters in parallel, and synchronous calls to encode the
counter values. The result of Theorem 5 follows directly from:

Proposition 10 Given a 2Cs, we can reduce its reachability question to the
Parikh coverability question for a concurrent Qdas that demands both
synchronous and asynchronous dispatch actions.

As discussed before, we can separate each $G$ for $(G,d) \in Reach(A_C)$ into
three components, one consisting only of a vertex $v_0$ with
$\lambda(v_0) = \mathit{main}$ and two paths $v_1 v_2 \cdots v_k$ and
$v'_1 v'_2 \cdots v'_l$ which we will call counter1 and counter2 in the
following.

As before, we define a relation between configurations of the 2Cs $C$ and the
Qdas $A_C$. For $c = (G,d) \in Reach(A_C)$ and
$y = (x,k,l) \in Reach(C) \subseteq X \times \mathbb{N} \times \mathbb{N}$ we
write $c \triangleright y$ if $d(\mathit{state}) = x$, $|counter1| = k$, and
$|counter2| = l$.
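
The relation can be read as a simple check on a flattened view of a configuration. The following Python sketch is one possible reading, under the assumption that the size of a counter component is the number of eins tasks on its path (the bottom null task is not counted). The type `QdasConfig`, the helpers `eins_count` and `encodes`, and the location name `"x3"` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QdasConfig:
    state: str            # value of the global variable `state`
    counter1: List[str]   # block names along the path called counter1
    counter2: List[str]   # block names along the path called counter2


def eins_count(path: List[str]) -> int:
    """Assumed measure of a counter component: its number of eins tasks."""
    return sum(1 for block in path if block.startswith("eins"))


def encodes(c: QdasConfig, y: Tuple[str, int, int]) -> bool:
    """Check the relation between a Qdas configuration c and y = (x, k, l)."""
    x, k, l = y
    return (c.state == x
            and eins_count(c.counter1) == k
            and eins_count(c.counter2) == l)


c = QdasConfig("x3", ["null(1)", "eins(1)", "eins(1)"], ["null(2)"])
print(encodes(c, ("x3", 2, 0)))
```

The example configuration has two eins(1) tasks above null(1) and only null(2) in the second component, so it is related to the 2Cs configuration with location `x3` and counter values 2 and 0.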

The rendezvous assures a unique interleaving of actions of main, null(1), and
null(2) until main reaches line 12. Let us in the following consider the
reached configuration $c^0 = (G^0, d^0)$ with $d^0(\mathit{state}) = x_0$,
$d^0(\ell_0) = d^0(\ell_1) = 0$ and $G^0$ with

[Diagram of $G^0$ not recoverable]

(where $s_{12}$ is the state of main in line 12) as "initial" configuration of
the Qdas.

Note that counter1 and counter2 are independent, i.e., they do not synchronize
except via main. Further, there is no more than one task active in each of
counter1 and counter2. The unique tasks null(1) and null(2) never terminate.
The rendezvous synchronization assures that there is only one possible
interleaving between the main task and the currently running tasks in counter1
and counter2:

• main does loops of the form

  [Diagram not recoverable: loop of main, for $i \in \{1,2\}$, $a \in \Sigma_C$]

• which leads to the following interleaving of actions of main with actions of
  the i-th counter component

  [Diagram not recoverable]

  where ○ → ○ translates the sent action a to a meta-action (a) of the i-th
  counter as follows:



    • an action incr is mapped to the action dispatch_s(q, eins(i)) and the
      activation of the dispatched task;

    • an action decr is mapped to the termination of the current task, which
      is only possible if the current task is a block eins(i);

    • the test for empty stack is mapped to an epsilon action; this action is
      only possible in null(i).

Note that if $C_i(a)$ is not possible, then there will be no acknowledgement,
hence $A_C$ blocks.
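
The three cases above, together with the blocking behaviour, can be sketched as a small stack machine in Python. This is a minimal illustration under stated assumptions, not the Qdas itself: the counter component is modelled as a list of block names with the unblocked task on top, and a return value of `False` stands for the missing acknowledgement that leaves main blocked. The class `CounterComponent` and its method names are hypothetical.

```python
class CounterComponent:
    """Stack view of the i-th counter: null(i) at the bottom, eins(i) above."""

    def __init__(self, i: int) -> None:
        self.i = i
        self.tasks = [f"null({i})"]  # the null task never terminates

    @property
    def value(self) -> int:
        """Counter value, read as the number of eins tasks on the stack."""
        return len(self.tasks) - 1

    def handle(self, op: str) -> bool:
        """Apply an operation; return True iff it would be acknowledged."""
        top = self.tasks[-1]  # the only unblocked task of this component
        if op == "incr":
            # synchronous dispatch of eins(i) and activation of the new task
            self.tasks.append(f"eins({self.i})")
            return True
        if op == "decr":
            if top.startswith("eins"):
                self.tasks.pop()  # the current eins task terminates
                return True
            return False          # null(i) cannot receive decr: main blocks
        if op == "is_zero":
            return top.startswith("null")  # only null(i) acknowledges
        raise ValueError(f"unknown operation: {op}")


counter = CounterComponent(1)
for op in ["is_zero", "incr", "incr", "decr", "is_zero"]:
    print(op, counter.handle(op), counter.value)
```

In the usage loop, the final `is_zero` is not acknowledged because one eins(1) task is still on top of the stack, which corresponds to main being blocked after guessing a transition that is not fireable.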

Thus we can cut a run of $A_C$ into (an initial phase and) a sequence of
phases of the above form that will be abbreviated $\mathit{trans}(s, a, s')$
in the following.

Lemma 6 Let $C$ be a 2Cs and $A_C$ the associated Qdas. If $y \in Reach(C)$
then there exists $c \in Reach(A_C)$ such that $c \triangleright y$. Further,
if $c = (G,d) \in Reach(A_C)$ where $d$ valuates
$d(\ell_0) = d(\ell_1) = 0$, then there exists $y \in Reach(C)$ with
$c \triangleright y$.

Proof. 

Given a run $x_0 \delta_1 x_1 \delta_2 \ldots \delta_k x_k$ of $C$, there
exists a run of $A_C$ that can be cut into phases $t_1, \ldots, t_k$ where
$t_i = \mathit{trans}(s_{i-1}, a_i, s_i)$ and $\delta_i = (s_{i-1}, a_i, s_i)$
for $1 \le i \le k$. Obviously $c^0 \triangleright x_0$ and
$d^0(\ell_0) = d^0(\ell_1) = 0$. Hence, the reverse direction follows by a
straightforward inductive argument.